
Posts

Oct, 14

A Heterogeneous Parallel Framework for Domain-Specific Languages

Computing systems are becoming increasingly parallel and heterogeneous, and therefore new applications must be capable of exploiting parallelism in order to continue achieving high performance. However, targeting these emerging devices often requires using multiple disparate programming models and making decisions that can limit forward scalability. In previous work we proposed the use of domain-specific languages […]
Oct, 14

Fast Multipole Method vs. Spectral Method for the Simulation of Isotropic Turbulence on GPUs

This paper presents calculations of homogeneous isotropic turbulence at Re_{lambda} = 100 using both a pseudo-spectral method and a fast multipole vortex method on a 256^3 grid. For the vortex method, both algorithmic and hardware acceleration are applied using a highly parallel fast multipole method (FMM) on GPUs. The spectral method uses the FFTW library […]
Oct, 13

Benchmarking Across Platforms: European Option Pricing

Using a popular Monte Carlo workload which implements European option pricing, we tested a variety of architectures, including NVIDIA and AMD GPUs, a ClearSpeed accelerator, and multi-core processors, along with different programming approaches. We conclude that this particular workload seems better suited to GPU-type architectures than to alternatives such as CPU or […]
Oct, 13

Firepile: Run-time Compilation for GPUs in Scala

Recent advances have enabled GPUs to be used as general-purpose parallel processors on commodity hardware for little cost. However, the ability to program these devices has not kept up with their performance. The programming model for GPUs has a number of restrictions that make it difficult to program. For example, software running on the GPU […]
Oct, 13

A rendering method for simulated emission nebulae

Emission nebulae are some of the most beautiful stellar phenomena. The newly formed hot stars inside the nebulae ionize the surrounding gas, making it glow in a variety of colors. The focus of this work is to find a method for interactive rendering of simulated emission nebulae. A rendering program has been developed to render and […]
Oct, 13

Introduction to GPU Radix Sort

Radix sort is one of the fastest sorting algorithms, especially for large problem sizes. It is not a comparison sort but a counting sort: when sorting on n-bit keys, 2^n counters are prepared, one for each possible key value.
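
The counting idea in this abstract can be sketched on the CPU as follows. This is a minimal illustration, not the GPU implementation the post describes: it assumes non-negative integer keys, and the names `counting_pass`, `radix_sort`, `key_bits`, and `bits_per_pass` are hypothetical. Each pass sorts on one n-bit digit using 2^n counters, and a prefix sum over the counters gives each digit value its output offset.

```python
def counting_pass(keys, shift, bits):
    """Stable counting sort on the `bits`-wide digit that starts at bit `shift`."""
    radix = 1 << bits              # 2^bits counters, one per possible digit value
    mask = radix - 1

    # Histogram: count how many keys carry each digit value.
    counts = [0] * radix
    for k in keys:
        counts[(k >> shift) & mask] += 1

    # Exclusive prefix sum: first output position for each digit value.
    offsets = [0] * radix
    total = 0
    for d in range(radix):
        offsets[d] = total
        total += counts[d]

    # Scatter keys in input order, which keeps the pass stable.
    out = [0] * len(keys)
    for k in keys:
        d = (k >> shift) & mask
        out[offsets[d]] = k
        offsets[d] += 1
    return out


def radix_sort(keys, key_bits=32, bits_per_pass=4):
    """Least-significant-digit radix sort for non-negative ints below 2**key_bits."""
    for shift in range(0, key_bits, bits_per_pass):
        keys = counting_pass(keys, shift, bits_per_pass)
    return keys


if __name__ == "__main__":
    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Processing a few bits per pass keeps the counter table small (16 counters for 4 bits) at the cost of more passes; sorting a full 32-bit key in one pass would need 2^32 counters.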
Oct, 13

Input Sensitivity of GPU Program Optimizations

Graphics Processing Units (GPUs) have become increasingly adopted for the enhancement of computing throughput. However, the development of a high-quality GPU application is challenging, due to the large optimization space and the complex, unpredictable effects of optimizations on GPU program performance. Many recent efforts have been employing empirical search-based auto-tuners to tackle the problem, but few […]
Oct, 13

gpustats: GPU Library for Statistical Computing in Python

In this talk we will discuss gpustats, a new Python library for assisting in "big data" statistical computing applications, particularly Monte Carlo-based inference algorithms. The library provides a general code generation / metaprogramming framework for easily implementing discrete and continuous probability density functions and random variable samplers. These functions can be utilized to achieve more […]
Oct, 13

Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio

We discuss implementation aspects of a software-defined radio system that allows for dynamic waveform reconfiguration during runtime without interrupting dataflow processing. Traditional software-defined radio systems execute a waveform statically, exactly as it is programmed. Reconfiguration is provided by executing a different waveform, which requires the system to stop processing data while reconfiguration occurs, and also […]
Oct, 13

Developing a High Performance GPGPU Compiler Using Cetus

In this paper we present our experience in developing an optimizing compiler for general purpose computation on graphics processing units (GPGPU) based on the Cetus compiler framework. The input to our compiler is a naive GPU kernel procedure, which is functionally correct but without any consideration for performance optimization. Our compiler applies a set of […]
Oct, 13

PlinkGPU: A Framework for GPU Acceleration of Whole Genome Data Analysis

Genome-wide association studies (GWAS) are performed in order to detect the genetic variations associated with physical traits (e.g. diseases), and Plink is a popular software system for analyzing the data of GWAS. Due to the large datasets involved, the task of data processing can be very time-consuming. Although GPUs (graphics processing units) are not generally […]
Oct, 13

State of The Art Report on GPU

This report aims to provide a beginner’s introduction to GPUs, from both a hardware and a software angle. We look at the evolution of specialist graphics hardware from the early days of PC graphics cards to the present day. We describe the currently available hardware from NVIDIA and AMD/ATI, and the current software from both […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
