high performance computing on graphics processing units: hgpu.org

Optimizing the exploitation of multicore processors and GPUs with OpenMP and OpenCL

Roger Ferrer, Judit Planas, Pieter Bellens, Alejandro Duran, Marc Gonzalez, Xavier Martorell, Rosa M. Badia, Eduard Ayguade, Jesus Labarta

Barcelona Supercomputing Center, Barcelona, Spain

Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, 2011, Volume 6548/2011, 215-229

DOI:10.1007/978-3-642-19595-2_15

In this paper, we present OMPSs, a programming model based on OpenMP and StarSs that can also incorporate OpenCL or CUDA kernels. We evaluate the proposal on three different architectures (SMP, Cell/B.E., and GPUs), showing the wide usefulness of the approach. The evaluation uses four benchmarks: Matrix Multiply, BlackScholes, Perlin Noise, and Julia Set. We compare the results with those obtained by running the same benchmarks written in OpenCL on the same architectures. The results show that OMPSs greatly outperforms the OpenCL environment and is more flexible in exploiting multiple accelerators. Moreover, thanks to the simplicity of its annotations, it increases programmer productivity.

Tags: Cell processor, Computer science, CUDA, GPU cluster, nVidia, OpenCL, OpenMP, Programming techniques

August 18, 2011 by hgpu
