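An Investigation of Parallel Programming Techniques Applied to Monte Carlo Simulations for Post-Flight Reconstruction of Spacecraft Trajectory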

Robert A. Williams, Justin S. Green
DOI: 10.2514/6.2018-3431 (https://doi.org/10.2514/6.2018-3431)
Published in: 2018 Modeling and Simulation Technologies Conference
Publication date: 2018-06-24
Citations: 0

Abstract

Parallelizing software to execute on multi-core central processing units (CPUs) and graphics processing units (GPUs) can be challenging. For some fields outside of computer science, this transition comes with new issues; for example, memory limitations can require modifications to code that was not initially developed to run on GPUs. This work applies the Open Multi-Processing (OpenMP) and Open Accelerators (OpenACC) directive-based parallelization strategies to a Monte Carlo simulation approach for trajectory reconstruction, enabling it to run on multi-core CPUs and GPUs. Large matrix operations, the most common use of GPUs, are not present in this algorithm; however, the natural parallelism of independent trajectories in Monte Carlo simulations is exploited. Benchmarking data are presented comparing execution times of the software on single-thread CPUs, multi-thread CPUs with OpenMP, and multi-thread GPUs with OpenACC. These data were collected on nodes with Intel® Xeon® E5-2670 (Sandy Bridge) CPUs augmented with NVIDIA® Tesla® K40 GPUs on the Pleiades supercomputer cluster at the National Aeronautics and Space Administration (NASA) Ames Research Center (ARC), and on a local Intel® Xeon Phi™ node at NASA Langley Research Center (LaRC).

[...] and orientation), and integrates the inertial measurement unit (IMU) data to determine the vehicle states throughout its flight. Lugo et al. [1] developed a Monte Carlo based approach for trajectory reconstruction that incorporates the vehicle's final state information and introduces statistics. This method decreases uncertainties in the reconstruction results, which improves model validation and post-flight analysis. However, this Monte Carlo approach requires the integration of several thousand trajectories. These calculations are time consuming when executed serially, but the execution time can be decreased by using concurrent computation.

This paper examines the use of parallel programming techniques on an algorithm that applies inertial navigation to trajectory reconstruction in a Monte Carlo dispersion process. The two techniques used are OpenMP, on multi-core CPUs, and OpenACC, on GPUs. Two studies are conducted to determine optimal performance: one based on thread count with OpenMP, and one based on registers per thread with OpenACC. Additionally, comparisons are shown between three different compilers and three different types of hardware. [...] or V100, will be tested in future work.
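The "natural parallelism of independent trajectories" mentioned above can be illustrated with a minimal sketch in C using an OpenMP directive. This is not the authors' code. The 1-D double-integration model, the uniform accelerometer-bias dispersion, and every name here (`uniform_pm1`, `integrate_trajectory`, `run_monte_carlo`, `bias_max`) are assumptions made for illustration only. Each trajectory gets its own random-number state seeded from its index, so no state is shared between loop iterations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-trajectory generator (splitmix64 mixing step).
   Returns a value in [-1, 1). Each trajectory owns its state, so
   threads never share random-number state. */
static double uniform_pm1(uint64_t *state) {
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;
    return (double)(z >> 11) / 9007199254740992.0 * 2.0 - 1.0;
}

/* Integrate one dispersed trajectory: 1-D position from sampled
   acceleration, with a constant bias drawn once per trajectory
   (an illustrative stand-in for a real dispersion model). */
static double integrate_trajectory(const double *accel, size_t n, double dt,
                                   double bias_max, uint64_t seed) {
    uint64_t rng = seed;
    double bias = bias_max * uniform_pm1(&rng);
    double v = 0.0, x = 0.0;
    for (size_t k = 0; k < n; ++k) {
        v += (accel[k] + bias) * dt;  /* update velocity first */
        x += v * dt;                  /* then position (semi-implicit Euler) */
    }
    return x;
}

/* Run n_traj independent trajectories; iterations share only read-only
   input and write to distinct output slots, so the loop is safe to
   parallelize. Without OpenMP enabled the pragma is ignored and the
   loop runs serially with identical results. */
void run_monte_carlo(const double *accel, size_t n, double dt,
                     double bias_max, double *final_pos, int n_traj) {
    assert(n_traj > 0 && final_pos != NULL);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_traj; ++i) {
        final_pos[i] = integrate_trajectory(accel, n, dt, bias_max,
                                            (uint64_t)i + 1);
    }
}
```

Built with an OpenMP-enabled compiler (for example the `-fopenmp` flag in GCC), the thread count can be varied through the `OMP_NUM_THREADS` environment variable, which is one way a thread-count study like the one described could be set up. An OpenACC variant would typically swap the directive for `#pragma acc parallel loop` with data clauses for `accel` and `final_pos`, and mark the called routines with `#pragma acc routine seq`; that variant is not shown here.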