
Posts

Jun, 7

Scientific Computing on Hybrid Architectures

Modern computer architectures, with multicore CPUs and GPUs or other accelerators, make stronger demands than ever on writers of scientific code. Normally, the most efficient program has to be written – with substantial effort – by expert programmers for a specific application on a particular computer. This thesis deals with several algorithmic and technical […]
Jun, 7

CUDA Based Performance Evaluation of the Computational Efficiency of the DCT Image Compression Technique on Both the CPU and GPU

Recent advances in computing, such as massively parallel GPUs (Graphics Processing Units), coupled with the need to store and deliver large quantities of digital data, especially images, have brought a number of challenges for computer scientists, the research community and other stakeholders. These challenges, such as prohibitively large costs to manipulate the digital data amongst […]
Jun, 6

Parallel Implementation of Finite Element Codes using CUDA

The purpose of this work is to study the performance of parallel computation of the Finite Element Method using NVIDIA's CUDA. The numerical experiments are performed only on the stiffness matrix using the conjugate gradient method. In addition, the generalized minimal residual method is considered to solve the Stokes problem using both PETSc and CUDA. […]
Jun, 6

Development of an explicit pressure-based unstructured solver for three-dimensional incompressible flows with graphics hardware acceleration

In this research, a numerical algorithm was developed to solve the incompressible Navier-Stokes equations using explicit time stepping. The goal of this research was to develop an unsteady SIMPLER-based algorithm with lower computational overhead. The new explicit algorithm uses a four-stage Runge-Kutta scheme to update the velocities and eliminates the need for the […]
Jun, 6

Parallelization & checkpointing of GPU applications through program transformation

GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that make writing general-purpose applications for GPUs tractable has consolidated GPUs as an alternative for accelerating such applications. Among the areas that have benefited from GPU acceleration are: signal and image processing, computational fluid dynamics, quantum chemistry, […]
Jun, 6

A MapReduce Framework for Heterogeneous Computing Architectures

Nowadays, an increasing number of computational systems are equipped with heterogeneous compute resources, i.e., resources that follow different architectures. This applies to the level of a single chip, a single node and even supercomputers and large-scale clusters. With their impressive price-to-performance ratio as well as power efficiency compared to traditional multicore processors, graphics processing units (GPUs) have […]
Jun, 6

A comprehensive study of Dynamic Memory Management in OpenCL kernels

Traditional (sequential) applications use malloc for a variety of dynamic data structures, like linked lists or trees. GPGPU is gaining attention and popularity because its massively-parallel architecture allows for great speed improvement for programs that can be parallelised and implemented for a platform like OpenCL. Programmers who try to port their existing sequential or even […]
Jun, 6

A Reliable Throughput Gain on GPUs

Graphics Processing Units (GPUs) are widely employed in many applications in which high computing capabilities are required and parallelism can be fruitfully exploited. A higher number of parallel threads brings higher throughput to the GPU, but may also increase the code's neutron-induced error rate. The GPU's sensitivity depends not only on the code throughput, […]
Jun, 6

Automating elimination of idle functions by run-time reconfiguration

A design approach is proposed to automatically identify and exploit run-time reconfiguration opportunities while optimising resource utilisation. We introduce Reconfiguration Data Flow Graph, a hierarchical graph structure enabling reconfigurable designs to be synthesised in three steps: function analysis, configuration organisation, and run-time solution generation. Three applications, based on barrier option pricing, particle filter, and reverse […]
Jun, 6

Implicit Skinning: Real-Time Skin Deformation with Contact Modeling

Geometric skinning techniques, such as smooth blending or dual quaternions, are very popular in the industry for their high performance, but fail to mimic realistic deformations. Other methods make use of physical simulation or control volume to better capture the skin behavior, yet they cannot deliver real-time feedback. In this paper, we present the first purely […]
Jun, 6

ElastiFace: Matching and Blending Textured Faces

In this paper we present ELASTIFACE, a simple and versatile method for establishing correspondence between textured face models, either for the construction of a blend-shape facial rig or for the exploration of new characters by morphing between a set of input models. While there exists a wide variety of approaches for inter-surface mapping and mesh […]
Jun, 6

Accelerating Fast Fourier Transform for Wideband Channelization

Wideband channelization is a compute-intensive task with performance requirements that are arguably greater than what current multi-core CPUs can provide. To date, researchers have used dedicated hardware such as field programmable gate arrays (FPGAs) to address the performance-critical aspects of the channelizer. In this work, we assess the viability of the graphics processing unit (GPU) […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org