ConTraPh: Contrastive Learning for Parallelization and Performance Optimization
Iowa State University, Ames, Iowa, USA
ACM International Conference on Supercomputing, 2025
@inproceedings{mahmud2025contraph,
  title={ConTraPh: Contrastive Learning for Parallelization and Performance Optimization},
  author={Mahmud, Quazi Ishtiaque and TehraniJamsaz, Ali and Ahmed, Nesreen K. and Willke, Theodore L. and Jannesari, Ali},
  booktitle={ACM International Conference on Supercomputing (ICS)},
  year={2025}
}
With the advancement of HPC platforms, the demand for high-performing applications continues to grow. One effective way to enhance program performance is through parallelization. However, fully leveraging the powerful hardware of HPC platforms poses significant challenges. Even experienced developers must carefully consider factors such as runtime, memory usage, and thread-scheduling overhead. Additionally, achieving successful parallelization often requires running applications to determine the optimal configurations. In this paper, we propose ConTraPh, a framework that integrates contrastive learning with Transformers and Graph Neural Networks to capture the inherent parallel characteristics of source programs through a multi-view program representation, utilizing both source code and compiler intermediate representations. This contrastive learning framework allows the model to effectively learn correct parallel configurations from positive samples while avoiding incorrect ones through negative samples. We evaluate ConTraPh on six downstream tasks spanning three parallel programming models (OpenMP, OpenCL, and OpenACC): OpenMP clause prediction, performant reduction style detection, performant scheduling type detection, CPU/GPU parallelism prediction, heterogeneous device mapping for OpenCL code, and OpenACC clause prediction. ConTraPh outperforms state-of-the-art models on these tasks, achieving accuracy improvements of up to 8%, 10%, 7%, 4%, 2%, and 9%, respectively. ConTraPh achieves speedups as high as 13x, 18x, 14x, and 4.4x on the reduction style detection, scheduling type detection, CPU/GPU parallelism prediction, and heterogeneous device mapping tasks, respectively. Results also demonstrate that ConTraPh can be integrated with third-party tools, such as large language models (LLMs), to enhance the performance of state-of-the-art models like GPT-4 by up to 15% based on established code generation metrics, such as CodeBERTScore.
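The core idea described in the abstract, pulling a program's embedding toward correctly parallelized variants and pushing it away from incorrectly configured ones, using both a source-code view and a compiler-IR graph view, can be sketched roughly as below. This is a minimal illustration in PyTorch, not the authors' implementation: the encoder sizes, the toy one-round message-passing graph encoder, the InfoNCE temperature, and the example OpenMP configurations are all assumptions made for demonstration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualViewEncoder(nn.Module):
    """Embeds a program from two views: its token sequence (Transformer)
    and a graph built from the compiler IR (here, one round of
    mean-neighbor message passing as a stand-in for a GNN)."""
    def __init__(self, vocab_size=1000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.gnn_lin = nn.Linear(dim, dim)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, tokens, node_feats, adj):
        seq = self.transformer(self.embed(tokens)).mean(dim=1)   # source-code view
        msgs = torch.relu(self.gnn_lin(adj @ node_feats))        # IR-graph view
        graph = msgs.mean(dim=1)
        return F.normalize(self.proj(torch.cat([seq, graph], dim=-1)), dim=-1)

def info_nce(anchor, positive, negatives, temperature=0.07):
    """InfoNCE loss: the anchor is attracted to the positive sample
    (a correctly parallelized variant) and repelled from the negatives
    (incorrectly configured variants)."""
    pos = (anchor * positive).sum(dim=-1, keepdim=True) / temperature
    neg = anchor @ negatives.T / temperature
    logits = torch.cat([pos, neg], dim=-1)
    labels = torch.zeros(anchor.size(0), dtype=torch.long)  # positive is index 0
    return F.cross_entropy(logits, labels)

# Toy usage with random inputs. In practice, the positive would embed the
# same loop with a correct configuration (e.g., "#pragma omp parallel for
# reduction(+:sum)") and the negatives would carry incorrect clauses.
enc = DualViewEncoder()
B, T, N, D = 2, 16, 8, 128
tokens = torch.randint(0, 1000, (B, T))
node_feats = torch.randn(B, N, D)
adj = torch.rand(B, N, N)
z = enc(tokens, node_feats, adj)                      # anchor embeddings
z_pos = enc(tokens, node_feats, adj)                  # positive embeddings
z_neg = F.normalize(torch.randn(4, D), dim=-1)        # negative embeddings
loss = info_nce(z, z_pos, z_neg)
loss.backward()

Training with this kind of objective is what lets a single encoder serve all six downstream tasks: the contrastive pretraining shapes the embedding space around parallelization semantics, after which lightweight task heads can be attached for clause prediction, device mapping, and the other tasks.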
August 10, 2025 by hgpu