SC20 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Performance Analysis of a Quantum Monte Carlo Application on Multiple Hardware Architectures Using the HPX Runtime


Workshop: 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems

Authors: Weile Wei (Louisiana State University), Arghya Chatterjee (Oak Ridge National Laboratory), Kevin Huck (University of Oregon), Oscar Hernandez (Oak Ridge National Laboratory), and Hartmut Kaiser (Louisiana State University)


Abstract: This paper describes how we successfully used the HPX programming model to port the DCA++ application to multiple architectures, including POWER9, x86, ARM v8, and NVIDIA GPUs. We describe the lessons learned from this experience and the benefits of enabling HPX in the application to improve the CPU threading part of the code, which led to an overall 21% improvement across architectures. We also describe how we used HPX-APEX to raise the level of abstraction for understanding performance issues and identifying tasking optimization opportunities in the code, and how these relate to CPU/GPU utilization counters, device memory allocation over time, and CPU kernel-level context switches on a given architecture.




