Large Language Model Powered C-to-CUDA Code Translation: A Novel Auto-Parallelization Framework
Wuhan University
SSRN, 2025
CUDA (Compute Unified Device Architecture) parallel programming significantly improves computational efficiency across many fields. However, converting serial C code to CUDA is challenging for non-experts, and traditional tools struggle with complex patterns. While LLMs (Large Language Models) can automatically parallelize such complex patterns, they may generate CUDA code with synchronization and memory-management issues, and there is no systematic method to effectively evaluate the quality of translated CUDA code. To address these issues, we propose CUDA-LLMC (C-to-CUDA code translation framework enhanced with Large Language Model Correction), a new auto-parallelization paradigm and model-agnostic framework. Our framework employs a closed-loop process with multiple verification stages to reduce errors and improve the reliability of the generated code, ensuring adherence to CUDA syntax characteristics while maintaining memory and thread safety. It contains three modules: translation, verification, and correction. First, we employ an LLM to generate the initial code, which is then verified using various tools. Feedback from these tools is returned to the LLM, while a process-control component in the correction module prevents degradation in subsequent iterations and minimizes cost. Moreover, we introduce CudaBLEU (CUDA Bilingual Evaluation Understudy), a new metric designed to evaluate the quality of generated code in terms of parallelization and CUDA-specific features. Experiments demonstrate our method’s effectiveness: the generated code not only aligns better with CUDA-specific features but also achieves speedups of at least 45× in scientific computing for remote sensing, meteorology, and healthcare.
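The abstract does not include an implementation, but the closed loop it describes (translate, verify, feed diagnostics back, correct, with process control to avoid degradation and bound cost) can be sketched roughly as below. This is a minimal illustrative sketch, not the authors' code: the function names, the scoring scheme, and the stopping rules are assumptions introduced here for clarity.

```python
# Hypothetical sketch of a CUDA-LLMC-style closed loop: translate, verify,
# and correct, with a simple process-control guard that stops on success,
# score degradation, or budget exhaustion. All names and scoring details
# are illustrative assumptions, not the paper's API.

from dataclasses import dataclass


@dataclass
class VerificationReport:
    passed: bool      # did all checks (syntax, memory, thread safety) pass?
    score: float      # aggregate quality score in [0, 1]
    feedback: str     # tool diagnostics fed back to the LLM


def llm_translate(c_source: str) -> str:
    """Placeholder for the initial LLM translation of serial C into CUDA."""
    return "/* CUDA translation of: */\n" + c_source


def llm_correct(cuda_source: str, feedback: str) -> str:
    """Placeholder for an LLM correction pass driven by verifier feedback."""
    return cuda_source + f"\n/* corrected using feedback: {feedback} */"


def verify(cuda_source: str) -> VerificationReport:
    """Placeholder for compiler / sanitizer / static-analysis verification."""
    ok = "__global__" in cuda_source
    return VerificationReport(passed=ok,
                              score=1.0 if ok else 0.3,
                              feedback="" if ok else "missing kernel definition")


def translate_c_to_cuda(c_source: str, max_iters: int = 3) -> str:
    """Closed-loop translation with a degradation guard and iteration budget."""
    candidate = llm_translate(c_source)
    best_code, best_score = candidate, float("-inf")
    for _ in range(max_iters):
        report = verify(candidate)
        if report.score > best_score:
            best_code, best_score = candidate, report.score
        if report.passed:
            break
        if report.score < best_score:   # process control: stop on degradation
            break
        candidate = llm_correct(candidate, report.feedback)
    return best_code


if __name__ == "__main__":
    print(translate_c_to_cuda("for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];"))
```

In a real pipeline the placeholders would be replaced by an LLM client, the CUDA compiler, and memory/thread-safety checkers, and the aggregate score could be a CudaBLEU-style measure combining surface similarity with CUDA-specific parallelization features.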
April 13, 2025 by hgpu