Enabling CP2K Application for Exascale Computing with Accelerators using OpenACC and OpenCL
WCSS, Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-370 Wroclaw, Poland
PRACE, 2014
@article{uchronski2014enabling,
title={Enabling CP2K Application for Exascale Computing with Accelerators using OpenACC and OpenCL},
author={Uchronski, Mariusz and Kwiecien, Agnieszka and Gebarowski, Marcin},
year={2014}
}
CP2K is an application for atomistic and molecular simulation and, with its excellent scalability, is particularly important with regard to use on future exascale systems. The code is well parallelized using MPI and hybrid MPI/OpenMP, typically scaling well to ~1 core per atom in the system. Research on CP2K carried out within PRACE-1IP found that, because of the heavy use of sparse matrix multiplication for large systems, there is room for performance improvement. The main goal of this work, undertaken within PRACE-3IP, was to investigate the most time-consuming routines and port them to accelerators, particularly GPGPUs. The relevant areas of the code that can be effectively accelerated are the matrix multiplications (DBCSR library). A significant amount of work has already been done on the DBCSR library using CUDA. We focused on enabling the library on a potentially wider range of computing resources using OpenCL and OpenACC technologies, to bring the overall application closer to exascale. We introduce the ports and promising performance results. The work has led to the identification of a number of issues with using OpenACC in CP2K, which need to be further investigated and resolved to make the application and technology work better together.
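To make the sparse matrix multiplication mentioned in the abstract more concrete, the following is a minimal sketch of a block-sparse product in plain Python. It is an illustration only and not taken from CP2K or DBCSR: the dictionary-based storage and the names `dense_block_multiply`, `add_into`, and `block_sparse_multiply` are assumptions made for this example. The many small dense block products it performs are the kind of work that an accelerator port would offload.

```python
# Illustrative sketch only: a block-sparse matrix product in plain Python.
# The storage layout and all names below are hypothetical, not DBCSR's API.

def dense_block_multiply(a, b):
    """Multiply two small dense blocks given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]

def add_into(c, p):
    """Accumulate block p into block c in place."""
    for i, row in enumerate(p):
        for j, value in enumerate(row):
            c[i][j] += value

def block_sparse_multiply(a_blocks, b_blocks):
    """Multiply block-sparse matrices stored as {(block_row, block_col): block}.

    Only block pairs with a matching inner block index are multiplied;
    blocks that are absent from the dictionaries are treated as zero.
    """
    # Index B's blocks by block row so matching pairs are found quickly.
    b_by_row = {}
    for (k, j), block in b_blocks.items():
        b_by_row.setdefault(k, []).append((j, block))

    c_blocks = {}
    for (i, k), a_block in a_blocks.items():
        for j, b_block in b_by_row.get(k, []):
            product = dense_block_multiply(a_block, b_block)
            if (i, j) in c_blocks:
                add_into(c_blocks[(i, j)], product)
            else:
                c_blocks[(i, j)] = product
    return c_blocks

# A 2x2 grid of 2x2 blocks; missing keys stand for zero blocks.
A = {(0, 0): [[1, 2], [3, 4]], (1, 1): [[1, 0], [0, 1]]}
B = {(0, 0): [[1, 0], [0, 1]], (0, 1): [[2, 0], [0, 2]], (1, 1): [[5, 6], [7, 8]]}
C = block_sparse_multiply(A, B)
```

In this toy input the result has no block at position (1, 0), because no pair of stored blocks contributes to it, so the sparsity of the inputs carries through to the output.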
May 16, 2014 by hgpu