Aurora Exascale Early Science Program

Last updated July 27, 2022

Enabling code with performance portable GPU acceleration without interrupting scientific production.

0.1 Aurora Exascale Early Science Program Project

Collaborators: University of Southern California (Aiichiro Nakano and Ken-ichi Nomura), USC CARC (Marco Olguin), Argonne National Laboratory (Timothy Williams and Ye Luo), and Intel

The arrival of graphics processing unit (GPU)-based systems at the Argonne Leadership Computing Facility (ALCF) required teams of researchers to collaborate on porting their code, much of which was written exclusively for central processing unit (CPU)-based machines.

The Aurora Early Science Program (ESP) team developed an application called QXMD, a Fortran-based scalable quantum molecular dynamics code, during a previous project titled Metascalable Layered Materials Genome.

After assessing the computational profile of QXMD, the team decided to improve the abstraction of the code by reorganizing the most computationally intensive parts into internal modules which could be separately developed and validated. With this approach, the team created a mini-application, Local Field Dynamics (LFD), which computes many-electron dynamics in the framework of real-time time-dependent density functional theory and represents one of the most computationally expensive QXMD kernels.

LFD is written in C++ (as opposed to Fortran) and was designed as a plugin, allowing it to be configured independently of the larger QXMD code. To make LFD portable across platforms, the team developed its OpenMP GPU offload capability using the IBM XL and LLVM Clang compilers on NVIDIA GPUs. With this setup, the team could use their code to evaluate Intel software in terms of both capability and performance on Intel integrated and discrete GPUs.
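To give a sense of what OpenMP GPU offload looks like in C++, here is a minimal sketch. It is not taken from LFD or QXMD: the function name scaled_add and the loop it offloads are hypothetical, chosen only to show the directive-based style. The same source can be built by any compiler that supports OpenMP target offload, which is what makes this approach portable across GPU vendors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel (not from LFD): computes y[i] = a * x[i] + y[i].
// With an offload-capable compiler and a GPU present, the loop runs on the
// device. If no device is available, OpenMP falls back to the host, and if
// OpenMP is disabled the pragma is ignored and the loop runs serially.
void scaled_add(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    if (n == 0) {
        return;  // nothing to map or compute
    }

    // Raw pointers are used because OpenMP map clauses operate on
    // array sections, not on std::vector objects.
    const double* xp = x.data();
    double* yp = y.data();

    // map(to: ...) copies the input to the device before the loop;
    // map(tofrom: ...) copies y to the device and back again afterwards.
    #pragma omp target teams distribute parallel for \
        map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Calling scaled_add(2.0, x, y) with x = {1, 2, 3} and y = {10, 20, 30} leaves y holding {12, 24, 36} whether the loop ran on a GPU or on the CPU. The compiler flags that enable offload vary by toolchain and installation. With LLVM Clang targeting NVIDIA GPUs, for example, they are typically -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda.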

The collaboration among Argonne National Laboratory, the University of Southern California, and Intel has contributed to the Aurora software stack’s quick maturation to production quality for execution on exascale HPC resources.