Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation
Oak Ridge National Laboratory, Oak Ridge, TN, 37830, USA
arXiv:2309.07103 [cs.SE] (12 Sep 2023)
@misc{valerolara2023comparing,
  title={Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation},
  author={Pedro Valero-Lara and Alexis Huante and Mustafa Al Lail and William F. Godoy and Keita Teranishi and Prasanna Balaprakash and Jeffrey S. Vetter},
  year={2023},
  eprint={2309.07103},
  archivePrefix={arXiv},
  primaryClass={cs.SE}
}
We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We build on our previous work, which used the OpenAI Codex (a descendant of GPT-3) to generate similar kernels from simple prompts via GitHub Copilot. Our goal is to compare the accuracy of Llama-2 against our original GPT-3 baseline using a similar metric. Despite its simpler model, Llama-2 shows competitive or even superior accuracy. We also report on the differences between these foundational large language models as generative AI continues to redefine human-computer interactions. Overall, Copilot generates codes that are more reliable but less optimized, whereas codes generated by Llama-2 are less reliable but more optimized when correct.
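For context, the simplest of the three kernels is AXPY, which computes y = a*x + y over two vectors. The following is a minimal, hand-written sketch of the kind of C++/OpenMP implementation the study prompts the models to produce; it is an illustration for readers, not an actual Llama-2 or Copilot output from the paper.

#include <cstddef>
#include <cstdio>
#include <vector>

// AXPY: y = a*x + y. Illustrative hand-written C++/OpenMP version of one of
// the kernels the study asks the LLMs to generate; not a model output.
void axpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    #pragma omp parallel for  // distribute loop iterations across threads
    for (std::size_t i = 0; i < y.size(); ++i)
        y[i] = a * x[i] + y[i];
}

int main() {
    std::vector<float> x(1000, 1.0f), y(1000, 2.0f);
    axpy(3.0f, x, y);
    std::printf("y[0] = %.1f\n", y[0]); // expect 5.0
    return 0;
}

Compile with OpenMP enabled (e.g., g++ -fopenmp). GEMV and GEMM follow the same pattern with nested loops, while the GPU variants in the study (CUDA, HIP, OpenMP Offload, OpenACC) replace the pragma with device kernels or offload directives.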
September 17, 2023 by hgpu