Posts

Mar, 17

CUDA Compatible GPU as an Efficient Hardware Accelerator for AES Cryptography

This paper presents a study of the efficiency of applying modern Graphics Processing Units in symmetric key cryptographic solutions. It describes both traditional-style approaches based on the OpenGL graphics API and new ones based on the recent technology trends of major hardware vendors. It presents an efficient implementation of the Advanced Encryption Standard (AES) […]
Mar, 17

High-Throughput Transaction Executions on Graphics Processors

OLTP (On-Line Transaction Processing) is an important business system sector in various traditional and emerging online services. Due to the increasing number of users, OLTP systems require high throughput for executing tens of thousands of transactions in a short time period. Encouraged by the recent success of GPGPU (General-Purpose computation on Graphics Processors), we propose […]
Mar, 16

Mutual information computation and maximization using GPU

We present a GPU implementation to compute both mutual information and its derivatives. Mutual information computation is a highly demanding process due to the enormous number of exponential computations. It is therefore the bottleneck in many image registration applications. However, we show that these computations are fully parallelizable and can be efficiently ported onto the […]
Mar, 16

Direct evaluation of NURBS curves and surfaces on the GPU

This paper presents a new method to evaluate and display trimmed NURBS surfaces using the Graphics Processing Unit (GPU). Trimmed NURBS surfaces, the de facto standard in commercial 3D CAD modeling packages, are currently tessellated into triangles before being sent to the graphics card for display since there is no native hardware support for NURBS. […]
Mar, 16

Alignment invariant image comparison implemented on the GPU

This paper proposes a GPU-implemented algorithm to determine the differences between two binary images using Distance Transformations. These differences are invariant to slight rotation and offsets, making the technique ideal for comparisons between images that are not perfectly aligned. The parallel processing capabilities of the GPU allow for faster implementation than on traditional desktop […]
Mar, 16

Implementing the Himeno benchmark with CUDA on GPU clusters

This paper describes the use of CUDA to accelerate the Himeno benchmark on clusters with GPUs. The implementation is designed to optimize memory bandwidth utilization. Our approach achieves over 83% of the theoretical peak bandwidth on an NVIDIA Tesla C1060 GPU and performs at over 50 GFlops. A multi-GPU implementation that utilizes MPI alongside CUDA […]
Mar, 16

GPU-Based Techniques For Global Illumination Effects

This book presents techniques to render photo-realistic images by programming the Graphics Processing Unit (GPU). We discuss effects such as mirror reflections, refractions, caustics, diffuse or glossy indirect illumination, radiosity, single or multiple scattering in participating media, tone reproduction, glow, and depth of field. The book targets game developers, graphics programmers, and also students with […]
Mar, 16

Rotationally invariant sparse patch matching on GPU and FPGA

Vector and data-flow processors are particularly strong at dense, regular computation. Sparse, irregular data layouts cause problems because their unpredictable data access patterns prevent computational pipelines from filling effectively. A number of algorithms in image processing have been proposed which are not dense, and instead apply local neighborhood operations to a sparse, irregular set of […]
Mar, 16

GPU based acceleration of first principles calculation

We present Graphics Processing Unit (GPU) accelerated simulations of first principles electronic structure calculations. The FFT, which is the most time-consuming part, is accelerated by a factor of about 10. As a result, the total computation time of a first principles calculation is reduced to 15 percent of that on the CPU.
Mar, 16

GPU-Based Volume Rendering for Medical Imagery

Amid rapid advances in medical image visualization and augmented virtual reality applications, the low performance of the volume rendering algorithm is still a bottleneck. To make better use of well-developed hardware resources, a novel graphics processing unit (GPU)-based volume ray-casting algorithm is proposed in this paper. Running on a normal PC, it […]
Mar, 16

GPU-based frequency domain volume rendering

Frequency domain volume rendering (FVR) is a volume rendering technique with lower computational complexity than other techniques. In this paper the FVR algorithm is accelerated by a factor of 17 by mapping the rendering stage to the GPU. The overall hardware-accelerated pipeline is discussed and the changes relative to previous work are pointed out. […]
Mar, 16

Patterns of Inefficient Performance Behavior in GPU Applications

Writing efficient software for heterogeneous architectures equipped with modern accelerator devices presents a serious challenge to programmer productivity, creating a need for powerful performance-analysis tools to adequately support the software development process. To guide the design of such tools, we describe typical patterns of inefficient runtime behavior that may adversely affect the performance of applications […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
