Anthony Cowley, Camillo J. Taylor
We address the increasingly varied capabilities of specialized computing platforms by introducing a growing family of functionally limited mini-languages, implemented as embedded domain-specific languages (EDSLs) in Haskell, that may be composed to harness the computational features offered by a variety of hardware platforms. This development is based on a novel modular representation of the EDSL […]
Li-Wen Chang, Abdul Dakkak, Christopher I. Rodrigues, Wen-mei Hwu
We propose Tangram, a general-purpose high-level language that achieves high performance across architectures. In Tangram, a program is written by composing elemental code snippets, called codelets. A codelet can have multiple semantics-preserving implementations to enable automated algorithm and implementation selection. An implementation of a codelet can be written with tunable knobs to allow […]
Robert Senser, Tom Altman
DEFG is our declarative language and framework for the efficient generation of OpenCL GPU applications. Using our new DEFG implementation, run-time and lines-of-code comparisons are provided for three well-known algorithms: Sobel image filtering, breadth-first search and all-pairs shortest path. The DEFG declarative language and corresponding OpenCL kernels provide complete OpenCL applications. The lines-of-code comparison demonstrates […]
N. Tuck
This dissertation explores just-in-time (JIT) specialization as an optimization for OpenCL data-parallel compute kernels. It describes the implementation and performance of two extensions to OpenCL, Bacon and Specialization Annotated OpenCL (SOCL). Bacon is a replacement interface for OpenCL that provides improved usability and has JIT specialization built in. SOCL is a simple extension to OpenCL […]
Yaakoub El-Khamra, Niall Gaffney, David Walling, Eric Wernert, Weijia Xu, Hui Zhang
Over the years, R has been adopted as a major data analysis and mining tool in many domain fields. As Big Data overwhelms those fields, the computational needs and workload of existing R solutions increase significantly. With recent hardware and software developments, it is possible to enable massive parallelism with existing R solutions with little […]
Thierry Coppey
Dynamic programming is an algorithmic technique for solving problems that follow Bellman's principle: optimal solutions depend on optimal sub-problem solutions. The core idea behind dynamic programming is to memoize intermediate results in matrices to avoid repeated computation. Solving a dynamic programming problem consists of two phases: filling one or more matrices with intermediate solutions […]
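To illustrate the matrix-filling phase mentioned in the abstract above, here is a minimal Python sketch. It is not taken from the paper: the choice of problem (longest common subsequence) and the names `lcs_table` and `lcs_length` are assumptions made purely for illustration.

```python
def lcs_table(a, b):
    """Fill the DP matrix for the longest common subsequence (LCS) problem.

    table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    Hypothetical helper, chosen only to illustrate matrix filling.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                # Matching characters extend the optimal sub-solution.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise reuse the better of two memoized sub-solutions.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def lcs_length(a, b):
    """Read the final answer from the bottom-right cell of the matrix."""
    return lcs_table(a, b)[len(a)][len(b)]
```

Each cell is computed once from already-filled neighbours, which is the memoization the abstract describes; the bottom-right cell then holds the optimal value for the full inputs.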
Carlos Alberto Martinez Angeles
Datalog is a language based on first-order logic that was investigated as a data model for relational databases in the 1980s. It has recently been used in various new application areas, prompting proposals to run Datalog programs on new platforms such as Graphics Processing Units (GPUs) and MapReduce. Then as now, interest in […]
Swadhin K. Mangaraj
SkePU (Skeleton Programming Framework for Multi-core CPU and Multi-GPU Systems) is a parallel computing framework developed by Johan Enmyren and Christoph Kessler at Linköpings Universitet. This C++ template library provides a simple and unified interface for specifying data-parallel computations with the help of skeletons and targets multiple backends, e.g. for a sequential CPU, […]
John Canny, David Hall, Dan Klein
Constituency parsing with rich grammars remains a computational challenge. Graphics Processing Units (GPUs) have previously been used to accelerate CKY chart evaluation, but gains over CPU parsers were modest. In this paper, we describe a collection of new techniques that enable chart evaluation at close to the GPU’s practical maximum speed (a Teraflop), or around […]
Mathias Bourgoin, Emmanuel Chailloux, Jean-Luc Lamotte
We present an OCaml GPGPU library with a DSL embedded into OCaml to express GPGPU kernels. The level of performance achieved is measured through different examples. We also discuss the use of GPGPU programming to increase the performance of multicore CPU software written in OCaml.
Tom Henretty, Justin Holewinski, Richard Veras, Franz Franchetti, Louis-Noel Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan
Stencil computations are an integral part of applications in a number of scientific computing domains, such as image processing and partial differential equations. We describe a domain-specific language for regular stencil computations that allows the computations to be specified concisely. We describe a multi-target compiler for this DSL that generates optimized code for […]
Martin Griffin, Aidan Delaney, Karina Rodriguez Echavarria
We record and digitally preserve public monuments by describing them using a generative language. Such generative models allow visualisations of the monuments to be computed from descriptions that are suitable for transmission to personal mobile devices. The intent of using a personal mobile device is to make the cultural heritage experience more personal to the […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with nVidia and AMD graphics processing units (listed below). There is no restriction on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); terminal output logs from compilation and execution will be provided to the user.

Information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors