SC20 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Performance Assessment of OpenMP Compilers Targeting NVIDIA V100 GPUs


Workshop: WACCPD 2020: Seventh Workshop on Accelerator Programming Using Directives

Authors: Joshua H. Davis (University of Delaware), Christopher Daley (Lawrence Berkeley National Laboratory), Swaroop Pophale (Oak Ridge National Laboratory), Thomas Huber and Sunita Chandrasekaran (University of Delaware), and Nicholas Wright (Lawrence Berkeley National Laboratory)


Abstract: Heterogeneous systems are becoming increasingly prevalent. To exploit the rich compute resources of such systems, application developers need robust programming models that let them migrate legacy code seamlessly from today's systems to tomorrow's. For more than a decade, directives have been established as one of the promising paths for tackling programming challenges on emerging systems. This work focuses on applying and demonstrating OpenMP offloading directives on five proxy applications. We observe that performance varies widely from one compiler to another; a crucial aspect of our work is reporting best practices to application developers who use OpenMP offloading compilers. While the developer can work around some issues, other issues must be reported to the compiler vendors. By restructuring OpenMP offloading directives, we gain an 18x speedup for the su3 proxy application on NERSC's Cori system when using the Clang compiler, and a 15.7x speedup by switching max reductions to add reductions in the Laplace mini-app when using the Cray-llvm compiler on Cori.
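The reduction change mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' Laplace mini-app code: the grid size `N`, the function names `init_grid`, `sweep_max`, and `sweep_sum`, and the choice of a sum of squared differences as the add-based error metric are all assumptions made for illustration. Both functions perform the same Jacobi sweep and differ only in the reduction used for the error estimate. The directives are standard OpenMP target offloading; when compiled without OpenMP support they are ignored and the code runs serially.

```c
#define N 64  /* hypothetical grid size, chosen for illustration */

/* Zero the grid and set the top boundary row to 1.0. */
static void init_grid(double g[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            g[i][j] = (i == 0) ? 1.0 : 0.0;
}

/* Jacobi sweep with a max reduction: returns the largest pointwise change. */
static double sweep_max(double in[N][N], double out[N][N]) {
    double err = 0.0;
    #pragma omp target teams distribute parallel for collapse(2) \
        reduction(max : err) map(tofrom : err) \
        map(to : in[0:N][0:N]) map(tofrom : out[0:N][0:N])
    for (int i = 1; i < N - 1; i++) {
        for (int j = 1; j < N - 1; j++) {
            out[i][j] = 0.25 * (in[i - 1][j] + in[i + 1][j] +
                                in[i][j - 1] + in[i][j + 1]);
            double d = out[i][j] - in[i][j];
            if (d < 0.0) d = -d;       /* absolute change at this point */
            if (d > err) err = d;      /* max reduction */
        }
    }
    return err;
}

/* Same sweep with an add reduction: returns the sum of squared changes.
 * Assumption: this metric stands in for the "add reduction" variant; a
 * convergence tolerance tuned for the max metric must be re-chosen. */
static double sweep_sum(double in[N][N], double out[N][N]) {
    double err = 0.0;
    #pragma omp target teams distribute parallel for collapse(2) \
        reduction(+ : err) map(tofrom : err) \
        map(to : in[0:N][0:N]) map(tofrom : out[0:N][0:N])
    for (int i = 1; i < N - 1; i++) {
        for (int j = 1; j < N - 1; j++) {
            out[i][j] = 0.25 * (in[i - 1][j] + in[i + 1][j] +
                                in[i][j - 1] + in[i][j + 1]);
            double d = out[i][j] - in[i][j];
            err += d * d;              /* add reduction */
        }
    }
    return err;
}
```

The two variants compute different error measures, so swapping one for the other changes the stopping criterion as well as the reduction operator. Whether the swap pays off depends on how a given compiler implements each reduction on the device, which is the kind of compiler-specific behavior the abstract describes.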




