high performance computing on graphics processing units: hgpu.org

Posts

Dec, 22

PyFAI: a Python library for high performance azimuthal integration on GPU

The pyFAI package has been designed to reduce X-ray diffraction images into powder diffraction curves to be further processed by scientists. This contribution describes how to convert an image into a radial profile using the NumPy package and how the process was accelerated using Cython. The algorithm was parallelised, needing a complete re-design to benefit from […]

OpenCL
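
The PyFAI abstract above mentions converting an image into a radial profile. As a rough illustration of that idea only (not pyFAI's implementation or API), the sketch below bins every pixel by its distance from an assumed beam centre and averages the intensity in each ring. The function name `radial_profile` and its parameters are hypothetical, and it uses plain Python rather than NumPy so that it has no dependencies; with NumPy the same binning would typically be done with a weighted histogram.

```python
import math


def radial_profile(image, center, n_bins, r_max):
    """Average pixel intensity in concentric rings around `center`.

    image  : list of rows (list of lists of floats)
    center : (row, col) of the assumed beam centre, in pixel units
    n_bins : number of equal-width radial bins covering [0, r_max)
    r_max  : pixels at this radius or beyond are ignored
    """
    cy, cx = center
    sums = [0.0] * n_bins    # summed intensity per ring
    counts = [0] * n_bins    # number of pixels per ring
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = math.hypot(y - cy, x - cx)
            if r >= r_max:
                continue
            b = int(r * n_bins / r_max)  # index of the ring holding this pixel
            sums[b] += value
            counts[b] += 1
    # Empty rings are reported as 0.0 rather than dividing by zero.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# A 5x5 image with a single bright pixel at the centre.
spot = [[0.0] * 5 for _ in range(5)]
spot[2][2] = 9.0
profile = radial_profile(spot, center=(2, 2), n_bins=3, r_max=3.0)
```

Each pixel is handled independently until the final accumulation, which is why this kind of reduction is a natural candidate for parallel hardware; the accumulation into shared bins is the part that needs care when many threads write at once.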

Jun, 2

Loo.py: transformation-based code generation for GPUs and CPUs

Today’s highly heterogeneous computing landscape places a burden on programmers wanting to achieve high performance on a reasonably broad cross-section of machines. To do so, computations need to be expressed in many different but mathematically equivalent ways, with, in the worst case, one variant per target machine. Loo.py, a programming system embedded in Python, meets […]

OpenCL

Aug, 26

OpenCL programming using Python syntax

We describe ocl, a Python library built on top of PyOpenCL and NumPy. It allows programming GPU devices using Python. Python functions that are marked up using the provided decorator are converted into C99/OpenCL and compiled using the JIT at runtime. This approach lowers the barrier to entry to programming GPU devices since it requires […]

OpenCL

Jun, 14

Parakeet: A Just-In-Time Parallel Accelerator for Python

High level productivity languages such as Python or Matlab enable the use of computational resources by nonexpert programmers. However, these languages often sacrifice program speed for ease of use. This paper proposes Parakeet, a library which provides a just-in-time (JIT) parallel accelerator for Python. Parakeet bridges the gap between the usability of Python and the […]

CUDA

Jan, 16

Theano: Deep Learning on GPUs with Python

In this paper, we present Theano, a framework in the Python programming language for defining, optimizing and evaluating expressions involving high-level operations on tensors. Theano offers most of NumPy’s functionality, but adds automatic symbolic differentiation, GPU support, and faster expression evaluation. Theano is a general mathematical tool, but it was developed with the goal of […]

CUDA
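
The Theano abstract above lists automatic symbolic differentiation among its features. The toy sketch below shows what that means for a tiny expression graph: the derivative is built as another expression and only then evaluated. None of this is Theano's API; the classes `Var`, `Const`, `Add` and `Mul` are hypothetical names invented for the illustration, and no graph optimisation or GPU execution is attempted.

```python
class Const:
    """A constant leaf node."""
    def __init__(self, value):
        self.value = value

    def eval(self, env):
        return self.value

    def grad(self, wrt):
        return Const(0.0)


class Var:
    """A named symbolic variable; its value is supplied at evaluation time."""
    def __init__(self, name):
        self.name = name

    def eval(self, env):
        return env[self.name]

    def grad(self, wrt):
        # d(x)/d(x) = 1, d(y)/d(x) = 0
        return Const(1.0 if self is wrt else 0.0)


class Add:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def eval(self, env):
        return self.a.eval(env) + self.b.eval(env)

    def grad(self, wrt):
        # Sum rule: (a + b)' = a' + b'
        return Add(self.a.grad(wrt), self.b.grad(wrt))


class Mul:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def eval(self, env):
        return self.a.eval(env) * self.b.eval(env)

    def grad(self, wrt):
        # Product rule: (a * b)' = a' * b + a * b'
        return Add(Mul(self.a.grad(wrt), self.b), Mul(self.a, self.b.grad(wrt)))


# f(x) = x*x + 3*x, so f'(x) = 2*x + 3
x = Var("x")
f = Add(Mul(x, x), Mul(Const(3.0), x))
df = f.grad(x)            # a new expression graph, not a number
value = f.eval({"x": 2.0})
slope = df.eval({"x": 2.0})
```

Because `df` is itself an expression graph, a real system can simplify it and compile it for the target device before any numbers are involved, which a purely numerical approach cannot do.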

Dec, 12

Theano: A CPU and GPU Math Compiler in Python

Theano is a compiler for mathematical expressions in Python that combines the convenience of NumPy’s syntax with the speed of optimized native machine language. The user composes mathematical expressions in a high-level description that mimics NumPy’s syntax and semantics, while being statically typed and functional (as opposed to imperative). These expressions allow Theano to provide […]

CUDA

Dec, 11

Generalizing Execution of Vectorizable Computations by Generating Vector Oriented Byte Code

Computer simulations, which are widely used in both academia and industry, often work on very large data sets. This makes them well suited for harvesting the computing power of modern, highly parallel computing systems, such as GPUs, clusters and vector processors. The challenge lies in the fact that these systems must be programmed […]

CUDA

Oct, 2

A Complete Description of the UnPython and Jit4GPU Framework

A new compilation framework enables the execution of numerical-intensive applications in an execution environment that is formed by multi-core Central Processing Units (CPUs) and Graphics Processing Units (GPUs). A critical innovation is the use of a variation of Linear Memory Access Descriptors (LMADs) to analyze loop nests and determine automatically which memory locations must be […]

Aug, 21

Compiling Python to a hybrid execution environment

A new compilation framework enables the execution of numerical-intensive applications, written in Python, on a hybrid execution environment formed by a CPU and a GPU. This compiler automatically computes the set of memory locations that need to be transferred to the GPU, and produces the correct mapping between the CPU and the GPU address spaces. […]

OpenCL

