Understanding the Power of Evolutionary Computation for GPU Code Optimization

Jhe-Yu Liou, Muaaz Awan, Steven Hofmeyr, Stephanie Forrest, Carole-Jean Wu

Arizona State University, Tempe, AZ

arXiv:2208.12350 [cs.SE] (25 Aug 2022)

DOI:10.48550/arXiv.2208.12350

@misc{https://doi.org/10.48550/arxiv.2208.12350,
  doi={10.48550/ARXIV.2208.12350},
  url={https://arxiv.org/abs/2208.12350},
  author={Liou, Jhe-Yu and Awan, Muaaz and Hofmeyr, Steven and Forrest, Stephanie and Wu, Carole-Jean},
  keywords={Software Engineering (cs.SE), Distributed, Parallel, and Cluster Computing (cs.DC), Performance (cs.PF), FOS: Computer and information sciences},
  title={Understanding the Power of Evolutionary Computation for GPU Code Optimization},
  publisher={arXiv},
  year={2022},
}

Source code package: GEVO (GPU code optimization using EVOlutionary computation)

Achieving high performance in GPU code requires significant knowledge of parallel programming and GPU architectures, along with an in-depth understanding of the application. This combination makes it challenging to find performance optimizations for GPU-based applications, especially in scientific computing. This paper shows that the tool GEVO achieves significant speedups over human-optimized GPU code on two quite different scientific workloads. GEVO uses evolutionary computation to find code edits that improve the runtime of a multiple sequence alignment kernel by 28.9% and of a SARS-CoV-2 simulation by 29%. Further, when GEVO begins with an early, unoptimized version of the sequence alignment program, it finds a 30-fold speedup, a performance improvement similar to that of the hand-tuned version. This work presents an in-depth analysis of the discovered optimizations, revealing that the primary sources of improvement vary across applications; that most of the optimizations generalize across GPU architectures; and that several of the most important optimizations involve significant code interdependencies. The results showcase the potential of automated program optimization tools to reduce the optimization burden on scientific computing developers and to enhance performance portability for domain-specific accelerators.
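To make the idea of searching for code edits with evolutionary computation more concrete, the following is a toy sketch in Python. It is not GEVO's implementation or API: the edit names, their runtime multipliers, the interdependent pair, and the functions `runtime`, `mutate`, `crossover`, and `evolve` are all hypothetical, and the "runtime" is a simulated number rather than a measured GPU kernel. The sketch only illustrates the general loop of mutating and recombining sets of edits while keeping the fastest variants, including a case where two edits pay off only in combination.

```python
import random

# Hypothetical edit catalogue: edit name -> runtime multiplier when applied.
# All names and numbers are invented for illustration, not taken from the paper.
EDIT_EFFECT = {
    "remove_redundant_sync": 0.90,
    "hoist_load": 0.95,
    "drop_bounds_check": 0.97,
    "swap_operands": 1.00,
    "extra_copy": 1.10,
}
# Invented interdependency: an extra gain applies only when both edits are present.
PAIR_BONUS = ({"hoist_load", "drop_bounds_check"}, 0.85)
BASELINE_MS = 100.0


def runtime(edits):
    """Simulated runtime in ms of the program with the given set of edits applied."""
    t = BASELINE_MS
    for e in edits:
        t *= EDIT_EFFECT[e]
    pair, factor = PAIR_BONUS
    if pair <= edits:
        t *= factor
    return t


def mutate(edits, rng):
    """Toggle one randomly chosen edit in or out of the individual."""
    e = rng.choice(sorted(EDIT_EFFECT))
    return edits ^ {e}


def crossover(a, b, rng):
    """Keep edits shared by both parents; inherit each other edit with probability 1/2."""
    return frozenset(
        e for e in sorted(a | b) if (e in a and e in b) or rng.random() < 0.5
    )


def evolve(pop_size=20, generations=30, seed=0):
    """Elitist evolutionary search over sets of edits; returns the fastest set found."""
    rng = random.Random(seed)
    pop = [frozenset() for _ in range(pop_size)]  # start from the unmodified program
    for _ in range(generations):
        pop.sort(key=runtime)
        elite = pop[: pop_size // 2]  # the faster half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return min(pop, key=runtime)


if __name__ == "__main__":
    best = evolve()
    print(sorted(best), round(runtime(best), 2))
```

Because the faster half of the population is carried over unchanged each generation, the best simulated runtime can never get worse than the baseline. In a real setting the fitness evaluation would compile and time each program variant and check its output for correctness, which this sketch leaves out entirely.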

Tags: Bioinformatics, Biology, Computer science, CUDA, Evolutionary Computations, Neural and Evolutionary Computing, nVidia, nVidia GeForce GTX 1080 Ti, Package, performance portability, Sequence alignment, Tesla P100, Tesla V100

September 4, 2022 by hgpu
