Posts

Nov, 12

Safe Asynchronous Multicore Memory Operations

Asynchronous memory operations provide a means of coping with the memory wall problem in multicore processors, and are available on many platforms and in many languages, e.g., the Cell Broadband Engine, CUDA and OpenCL. Reasoning about the correct usage of such operations involves complex analysis of memory accesses to check for races. We present a method and […]
Nov, 11

Performance Analysis and Benchmarking of the Intel SCC

Over the past years, CPU design and development have shifted continuously towards both power-aware hardware architectures and many-core processors. The Intel Single-chip Cloud Computer (SCC) combines those two trends. It is an experimental prototype created by Intel Labs consisting of 48 Pentium cores. The SCC is a highly configurable […]
Nov, 11

Building a Real-Time Multi-GPU Platform: Robust Real-Time Interrupt Handling Despite Closed-Source Drivers

Architectures in which multicore chips are augmented with graphics processing units (GPUs) have great potential in many domains in which computationally intensive real-time workloads must be supported. However, unlike standard CPUs, GPUs are treated as I/O devices and require the use of interrupts to facilitate communication with CPUs. Given their disruptive nature, interrupts must be […]
Nov, 11

Many-body quantum chemistry on graphics processing units

Heterogeneous nodes composed of a multicore CPU and at least one graphics processing unit (GPU) are increasingly common in high-performance scientific computing, and significant programming effort is currently being undertaken to port existing scientific algorithms to these unique architectures. We present implementations for two many-body quantum chemistry methods on heterogeneous nodes: the coupled-cluster with single […]
Nov, 11

Accelerating the Smoldyn Spatial Stochastic Biochemical Reaction Network Simulator Using GPUs

Smoldyn is a spatio-temporal biochemical reaction network simulator. It belongs to a class of methods called particle-based methods and is capable of handling effects such as molecular crowding. Individual molecules are modelled as point objects that can diffuse and react in a control volume. Since each molecule has to be simulated individually, the computational complexity […]
Nov, 11

Improving GPU Performance via Large Warps and Two-Level Warp Scheduling

Due to their massive computational power, graphics processing units (GPUs) have become a popular platform for executing general purpose parallel applications. GPU programming models allow the programmer to create thousands of threads, each executing the same computing kernel. GPUs exploit this parallelism in two ways. First, threads are grouped into fixed-size SIMD batches known as […]
Nov, 11

An Interest Point Based Illumination Condition Matching Approach to Photometric Registration Within Augmented Reality Worlds

With recent and continued increases in computing power, and advances in the field of computer graphics, realistic augmented reality environments can now offer inexpensive and powerful solutions in a whole range of training, simulation and leisure applications. One key challenge to maintaining convincing augmentation, and therefore user immersion, is ensuring consistent illumination conditions between virtual […]
Nov, 11

Synthetic Aperture Beamformation using the GPU

A synthetic aperture ultrasound beamformer is implemented for a GPU using the OpenCL framework. The implementation supports beamformation of either RF signals or complex baseband signals. Transmit and receive apodization can be either parametric or dynamic using a fixed F-number, a reference, and a direction. Images can be formed using an arbitrary number of emissions […]
Nov, 10

Run, Stencil, Run! – A Comparison of Modern Parallel Programming Paradigms

While the performance of supercomputers has increased dramatically during the last 15 years, programming models and programming languages have more or less remained constant. Two de facto standards, the Message Passing Interface (MPI) for programming distributed-memory architectures and OpenMP for programming shared-memory architectures, still dominate the field of computational science and engineering. As current […]
Nov, 10

Generation of planar radiographs from 3D anatomical models using the GPU

The rapid growth in the number of transistors on integrated circuits has enabled numerous advances in computational hardware. Computer graphics development has benefited from these advances, reaching a stage where it delivers a realistic and rich user experience through amazing graphics. GPUs are nowadays capable of processing massive amounts of data, by taking advantage of their intrinsic […]
Nov, 10

cudaBayesreg: Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis

Graphics processing units (GPUs) are rapidly gaining maturity as powerful general parallel computing devices. A key feature in the development of modern GPUs has been the advancement of the programming model and programming tools. Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on NVIDIA many-core GPUs. In functional magnetic […]
Nov, 10

Fast Hair Simulation and Rendering Using CUDA and OpenGL

Realistically modeling and animating human hair is an open challenge in computer graphics. Human hair is geometrically complex, owing both to the very large number of strands on a head and to small microscopic variations on a single strand that contribute to its unique material properties. Additionally, due to complex interaction among all the hairs it […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors