Posts

Oct, 12

Implementing modular arithmetic using OpenCL

Problem description: Most public key algorithms are based on modular arithmetic. The simplest, and original, implementation of the protocol uses the multiplicative group of integers modulo p, where p is prime and g is a primitive root mod p. This is the way Diffie-Hellman is implemented. RSA is implemented in a similar way: c = m^e mod p*q. […]
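To make the abstract's formulas concrete, here is a minimal CPU-side sketch of modular exponentiation, the operation behind both Diffie-Hellman and RSA. It is not the paper's OpenCL implementation: the function name `mod_pow` and all numeric values are illustrative assumptions, and the primes are far too small for real use.

```python
def mod_pow(base, exponent, modulus):
    """Square-and-multiply: return base**exponent mod modulus."""
    result = 1 % modulus          # handles the modulus == 1 edge case
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next bit
        exponent >>= 1
    return result


# Toy Diffie-Hellman exchange (hypothetical small parameters).
p, g = 23, 5                      # p is prime, g is a primitive root mod p
a_secret, b_secret = 4, 3         # private exponents chosen by each party
a_public = mod_pow(g, a_secret, p)
b_public = mod_pow(g, b_secret, p)
shared_a = mod_pow(b_public, a_secret, p)
shared_b = mod_pow(a_public, b_secret, p)

# Toy RSA round trip, c = m^e mod p*q (hypothetical small parameters).
rsa_p, rsa_q = 61, 53
n = rsa_p * rsa_q
e, d = 17, 2753                   # e*d = 1 mod (rsa_p-1)*(rsa_q-1)
message = 65
cipher = mod_pow(message, e, n)
recovered = mod_pow(cipher, d, n)
```

Both parties arrive at the same shared value, and decrypting the RSA ciphertext returns the original message. Python's built-in `pow(base, exponent, modulus)` computes the same result and is the better choice outside of a demonstration.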
Oct, 12

General purpose computing on graphics processing units using OpenCL

General-Purpose computing using Graphics Processing Units (GPGPU) has been an area of active research for many years. During 2009 and 2010 much has happened in the GPGPU research field with the release of the Open Computing Language (OpenCL) programming framework and the new NVIDIA Fermi Graphics Processing Unit (GPU) architecture. This thesis explores the hardware […]
Oct, 12

Distance Fields Accelerated with OpenCL

An important task in any graphical simulation is the collision detection between the objects in the simulation. It is desirable to have a good general method for collision detection with high performance. This thesis describes an implementation of a collision detection method that uses distance fields to detect collisions. This method is quite robust and […]
Oct, 12

Cinematic Particle Systems with OpenCL

High-particle-count simulations are becoming increasingly crucial in many different aspects of our world today: both in entertainment – within video games, movies, and the like – and in scientific fields, where particle systems are capable of simulating and visualizing many interesting phenomena. This paper will explore the possibility of parallelizing the simulation of these large […]
Oct, 12

Color Correction Acceleration Using a Color Cube and OpenCL

The article deals with the problem of real-time color correction on modern but not dedicated video hardware, suggesting a new implementation of a fast algorithm for color transformation utilizing 3D look-up tables. We focus on the highly parallel nature of the proposed method and employ the GPU to perform the color calculations side-by-side. The paper is […]
Oct, 12

Evaluating performance and portability of OpenCL programs

Recently, OpenCL, a new open programming standard for GPGPU programming, has become available in addition to CUDA. OpenCL can support various compute devices due to its higher abstraction programming framework. Since there is a semantic gap between OpenCL and compute devices, the OpenCL C compiler plays an important role in exploiting the potential of compute devices […]
Oct, 11

Real-Time Rigid Body Interactions

Rigid body simulations are useful in many areas, most notably video games and computer animation. However, the requirements for accuracy and performance vary greatly between applications. In this project we combine methods and techniques from different sources to implement a rigid body simulation. The simulation uses a particle representation to approximate objects with the intent […]
Oct, 11

Performance Characterization and Optimization of Atomic Operations on AMD GPUs

Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate execution between concurrent threads, and in turn, assist in constructing complex data structures such as hash tables or implementing GPU-wide barrier synchronization. While the performance of atomic operations has improved substantially on […]
Oct, 11

Performance and Power Analysis of ATI GPU: A Statistical Approach

We present a comprehensive study on the performance and power consumption of a recent ATI GPU. By employing a rigorous statistical model to analyze execution behaviors of representative general-purpose GPU (GPGPU) applications, we conduct insightful investigations on the target GPU architecture. Our results demonstrate that the GPU execution throughput and the power dissipation are dependent […]
Oct, 11

Fast Surface Extraction and Visualization of Medical Images using OpenCL and GPUs

Marching Cubes (MC) is an algorithm that extracts surfaces from volumetric data. It is used extensively in visualization and analysis of medical data from modalities like CT and MR, often after a 3D segmentation of the interesting structures is performed. Traditional implementations of MC on modern CPUs are slow, using several seconds (even minutes) to […]
Oct, 11

On the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing

The graphics processing unit (GPU) has made significant strides as an accelerator in parallel computing. However, because the GPU has resided out on PCIe as a discrete device, the performance of GPU applications can be bottlenecked by data transfers between the CPU and GPU over PCIe. Emerging heterogeneous computing architectures that "fuse" the functionality of […]
Oct, 11

PyPs, a programmable pass manager

As hardware platforms are growing in complexity, compiler infrastructures need more flexibility: due to the heterogeneity of these platforms, compiler phases must be combined in unusual and dynamic ways, and several tools may need to be combined to handle specific parts of the compilation process efficiently. The need for flexibility also appears in iterative compilation […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
