5751

Posts

Sep, 24

Functional Signal Processing with Pure and Faust Using the LLVM Toolkit

Pure and Faust are two functional programming languages useful for programming computer music and other multimedia applications. Faust is a domain-specific language specifically designed for synchronous signal processing, while Pure is a general-purpose language which aims to facilitate symbolic processing of complicated data structures in a variety of application areas. Pure is based on the […]
Sep, 24

Dynamic Data Structures for Taskgraph Scheduling Policies with Applications in OpenCL Accelerators

OpenCL is an emerging open framework for parallel programming in heterogenous systems. OpenCL accelerators need to schedule the execution of submitted jobs with no (or only very imprecise) estimates of execution times, but respecting dependencies among them, which are given in the form of directed acyclic graph. This problem is known as stochastic taskgraph scheduling, […]
Sep, 24

CU2CL: A CUDA-to-OpenCL Translator for Multi-and Many-core Architectures

The use of graphics processing units (GPUs) in high-performance parallel computing continues to become more prevalent, often as part of a heterogeneous system. For years, CUDA has been the de facto programming environment for nearly all general-purpose GPU (GPGPU) applications. In spite of this, the framework is available only on NVIDIA GPUs, traditionally requiring reimplementation […]
Sep, 24

An MDE Approach for Automatic Code Generation from MARTE to OpenCL

Advanced engineering and scientific communities have used parallel programming to solve their large scale complex problems. Achieving high performance is the main advantage for this choice. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Thus, in order to reduce design complexity, we […]
Sep, 23

Integrated Framework for Heterogeneous Embedded Platforms Using OpenCL

The technology community is rapidly moving away from the age of computers and laptops, and is entering the emerging era of hand-held devices. With the rapid development of smart phones, tablets, and pads, there has been widespread adoption of Graphic Processing Units (GPUs) in the embedded space. The hand-held market is now seeing an ever […]
Sep, 23

Embedding OpenCL in C++ for Expressive GPU Programming

We present a high performance GPU programming language, based on OpenCL, that is embedded in C++. Our embedding provides shared data structures, typesafe kernel invocation, and the ability to more naturally interleave CPU and GPU functions, similar to CUDA but with the portability of OpenCL. For expressivity, our language provides an abstraction that releases control […]
Sep, 23

Improving SIMT Efficiency of Global Rendering Algorithms with Architectural Support for Dynamic Micro-Kernels

Wide Single Instruction, Multiple Thread (SIMT)architectures often require a static allocation of thread groups that are executed in lockstep throughout the entire application kernel. Individual thread branching is supported by executing all control flow paths for threads in a thread group and only committing the results of threads on the current control path. While convergence […]
Sep, 23

Accelerating reaction-diffusion simulations with general-purpose graphics processing units

SUMMARY: We present a massively parallel stochastic simulation algorithm (SSA) for reaction-diffusion systems implemented on Graphics Processing Units (GPUs). These are designated chips optimized to process a high number of floating point operations in parallel, rendering them well-suited for a range of scientific high-performance computations. Newer GPU generations provide a high-level programming interface which turns […]
Sep, 23

Parallel processing on NVIDIA graphics processing units using CUDA

This paper is an introduction to general-purpose computing on graphics processing units. This involves taking advantage of the parallel processing power of modern graphics cards to do general purpose computation. The CUDA architecture used for general purpose computations on NVIDIA graphics cards is described, and important features affecting the run times of CUDA programs are […]
Sep, 23

Functional and dynamic programming in the design of parallel prefix networks

A parallel prefix network of width n takes n inputs, a_1, a_2, … , a_n, and computes each yi = a_1 o a_2 o … o a_i for 1 <= i <= n, for an associative operator o. This is one of the fundamental problems in computer science, because it gives insight into how parallel […]
Sep, 23

Image super-resolution by vectorizing edges

As the resolution of output device increases, the demand of high resolution contents has become more eagerly. Therefore, the image superresolution algorithms become more important. In digital image, the edges in the image are related to human perception heavily. Because of this, most recent research topics tend to enhance the image edges to achieve better […]
Sep, 23

Acceleration of Functional Validation Using GPGPU

Logic simulation of a VLSI chip is a computationally intensive process. There exists an urgent need to map functional validation algorithms onto parallel architectures to aid hardware designers in meeting time-to-market constraints. In this paper, we propose three novel methods for logic simulation of combinational circuits on GPGPUs. Initial experiments run on two methods using […]

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: