high performance computing on graphics processing units: hgpu.org


A Performance Comparison of CUDA and OpenCL

Kamran Karimi, Neil G. Dickson, Firas Hamze

D-Wave Systems Inc., 100-4401 Still Creek Drive, Burnaby, British Columbia, Canada, V5C 6G9

arXiv:1005.2581v2 [cs.PF] (14 May 2010)


Package: AQUA@home


CUDA and OpenCL offer two different interfaces for programming GPUs. OpenCL is an open standard that can be used to program CPUs, GPUs, and other devices from different vendors, while CUDA is specific to NVIDIA GPUs. Although OpenCL promises a portable language for GPU programming, its generality may entail a performance penalty. In this paper, we compare the performance of CUDA and OpenCL using complex, near-identical kernels. We show that when using NVIDIA compiler tools, converting a CUDA kernel to an OpenCL kernel involves minimal modifications. Making such a kernel compile with ATI’s build tools involves more modifications. Our performance tests measure and compare data transfer times to and from the GPU, kernel execution times, and end-to-end application execution times for both CUDA and OpenCL.

Tags: Computational Physics, Computer science, CUDA, nVidia, nVidia GeForce GTX 260, OpenCL, Package, Performance

November 5, 2010 by hgpu

