Posts
Feb, 20
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms. Programs developed for manycore processors typically express finer thread-level parallelism than is appropriate for multicore platforms. We describe options for implementing fine-grained threading in software, and find that reasonable restrictions on […]
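To make the idea concrete, here is a minimal sketch (not the paper's actual compiler output; the kernel is invented) of the classic transformation such compilers apply: each fine-grained SPMD thread becomes one iteration of a loop, so a single CPU thread executes a whole thread block.

```cuda
// Fine-grained SPMD kernel as written for a GPU:
__global__ void scale(float *a, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= s;
}

// Equivalent multicore form: each logical GPU thread becomes one
// iteration of a "thread loop", so one CPU thread runs a whole block.
void scale_block_cpu(float *a, float s, int n,
                     int blockIdx_x, int blockDim_x) {
    for (int threadIdx_x = 0; threadIdx_x < blockDim_x; ++threadIdx_x) {
        int i = blockIdx_x * blockDim_x + threadIdx_x;
        if (i < n) a[i] *= s;
    }
}
```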
Feb, 20
XMalloc: A Scalable Lock-free Dynamic Memory Allocator for Many-core Machines
There are two avenues for many-core machines to gain higher performance: increasing the number of processors, and increasing the number of vector units in one SIMD processor. A truly scalable algorithm should take advantage of both. However, most prior scalable memory allocators scale well with the number of processors but poorly with the […]
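A minimal sketch of the lock-free core idea, assuming a simple pre-allocated arena; XMalloc's actual design adds SIMD-level buffering and free lists on top of this:

```cuda
// Threads claim space from a shared arena with a single atomic add,
// so allocation needs no locks. This is a simplified illustration,
// not XMalloc itself.
__device__ char g_arena[1 << 20];      // pre-allocated heap arena
__device__ unsigned int g_offset = 0;  // next free byte

__device__ void *simple_malloc(unsigned int bytes) {
    // atomicAdd returns the old offset, which becomes this thread's
    // private region; contention is limited to one atomic operation.
    unsigned int old = atomicAdd(&g_offset, bytes);
    if (old + bytes > sizeof(g_arena)) return 0;  // arena exhausted
    return g_arena + old;
}
```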
Feb, 20
Data Layout Transformation Exploiting Memory-Level Parallelism in Structured Grid Many-Core Applications
We present automatic data layout transformation as an effective compiler performance optimization for memory-bound structured grid applications. Structured grid applications include stencil codes and other code structures using a dense, regular grid as the primary data structure. Fluid dynamics and heat distribution, which both solve partial differential equations on a discretized representation of space, are […]
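A hedged illustration of one such layout transformation, using invented field names: moving from array-of-structures to structure-of-arrays so that a warp's loads coalesce.

```cuda
// Before: array of structures (AoS). Thread i reads cell[i].p, so
// consecutive threads touch addresses 2*sizeof(float) apart.
struct Cell { float p, t; };
__global__ void step_aos(Cell *cell, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) cell[i].p += 0.5f * cell[i].t;
}

// After: structure of arrays (SoA). Consecutive threads read
// consecutive floats, so the loads of p and t both coalesce.
__global__ void step_soa(float *p, float *t, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] += 0.5f * t[i];
}
```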
Feb, 19
Accelerating Particle Image Velocimetry Using Hybrid Architectures
High Performance Computing (HPC) applications are mapped to a cluster of multi-core processors communicating using high speed interconnects. More computational power is harnessed with the addition of hardware accelerators such as Graphics Processing Unit (GPU) cards and Field Programmable Gate Arrays (FPGAs). Particle Image Velocimetry (PIV) is an embarrassingly parallel application that can benefit from […]
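As a rough sketch of why PIV parallelizes so well (the window size and thread mapping here are assumptions, not the paper's): every candidate shift of an interrogation window can be scored by an independent thread.

```cuda
#define W 16  // interrogation window size (illustrative)

// 'b' is assumed to point into a padded frame so that shifted reads
// for every (dx, dy) candidate stay in bounds.
__global__ void piv_xcorr(const float *a, const float *b, int stride,
                          float *score, int range) {
    int dx = threadIdx.x - range;  // candidate displacement in x
    int dy = threadIdx.y - range;  // candidate displacement in y
    float s = 0.0f;
    for (int y = 0; y < W; ++y)
        for (int x = 0; x < W; ++x)
            s += a[y * stride + x] * b[(y + dy) * stride + (x + dx)];
    // One correlation score per candidate shift; the peak gives the
    // particle displacement for this window.
    score[threadIdx.y * blockDim.x + threadIdx.x] = s;
}
// Launched with a (2*range+1) x (2*range+1) thread block per window.
```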
Feb, 19
Programmability: Design Costs and Payoffs using AMD GPU Streaming Languages and Traditional Multi-Core Libraries
GPGPUs and multi-core processors have come to the forefront of interest in scientific computing. Graphics processors have become programmable, allowing their large memory bandwidth and thread-level parallelism to be exploited in general-purpose computing. This paper explores these two architectures, the languages used to program them, and the optimizations used to maximize performance […]
Feb, 19
Decoupled Access/Execute Metaprogramming for GPU-Accelerated Systems
We describe the evaluation of several implementations of a simple image processing filter on an NVIDIA GTX 280 card. Our experimental results show that performance depends significantly on low-level details such as data layout and iteration space mapping, which complicate code development and maintenance. We propose extending a CUDA- or OpenCL-like model with decoupled […]
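A small sketch of the iteration-space-mapping effect the authors mention (array shapes invented): the two kernels below compute the same copy, but only the first maps threadIdx.x to the contiguous dimension and therefore coalesces.

```cuda
__global__ void copy_coalesced(const float *in, float *out, int n) {
    int x = blockIdx.x * blockDim.x + threadIdx.x; // fast dimension
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < n && y < n) out[y * n + x] = in[y * n + x];
}

__global__ void copy_strided(const float *in, float *out, int n) {
    int y = blockIdx.x * blockDim.x + threadIdx.x; // fast thread index
    int x = blockIdx.y * blockDim.y + threadIdx.y; // maps to slow dim
    if (x < n && y < n) out[y * n + x] = in[y * n + x];
}
```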
Feb, 19
Compiler Support for High-level GPU Programming
We design a high-level abstraction of CUDA, called hiCUDA, using compiler directives. It simplifies the task of porting sequential applications to NVIDIA GPUs. This paper focuses on the design and implementation of a source-to-source compiler that translates a hiCUDA program into an equivalent CUDA program, and shows that the performance of CUDA code generated by […]
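For flavor, here is the kind of rewrite such a source-to-source compiler automates; the directive spelling below is hypothetical, not hiCUDA's actual pragma syntax.

```cuda
// Hypothetical annotated source (directive syntax invented here):
//
//   #pragma gpu_kernel grid(n/256) block(256)
//   for (int i = 0; i < n; i++) y[i] = a * x[i] + y[i];
//
// ...is translated into an equivalent CUDA kernel plus launch:
__global__ void saxpy(float a, const float *x, float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}
// Host side: saxpy<<<(n + 255) / 256, 256>>>(a, d_x, d_y, n);
```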
Feb, 19
High Performance Relevance Vector Machine on GPUs
The Relevance Vector Machine (RVM) algorithm has been widely utilized in many applications, such as machine learning, image pattern recognition, and compressed sensing. However, the RVM algorithm is computationally expensive. We seek to accelerate the RVM computation for time-sensitive applications by utilizing massively parallel accelerators such as GPUs. In this paper, the computation […]
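For context, the expense comes largely from the posterior update at the heart of standard RVM training (Tipping's formulation; the notation is generic, not necessarily this paper's):

```latex
\[
  \Sigma = \bigl(A + \beta\,\Phi^{\top}\Phi\bigr)^{-1}, \qquad
  \mu = \beta\,\Sigma\,\Phi^{\top}\mathbf{t}, \qquad
  A = \operatorname{diag}(\alpha_1,\dots,\alpha_M)
\]
% The M-by-M inverse recomputed every iteration is the O(M^3)
% bottleneck that motivates GPU acceleration.
```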
Feb, 19
A Generic Approach for Developing Highly Scalable Particle-Mesh Codes for GPUs
We present a general framework for GPU-based low-latency data transfer schemes that can be used for a variety of particle-mesh algorithms [8]. This framework makes it possible to hide the latency of data transfers between GPU-accelerated computing nodes by interleaving them with kernel execution on the GPU. We discuss as an example the fully relativistic […]
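A minimal sketch of the interleaving idea using CUDA streams (the paper's framework is more general; the kernel and buffer handling are assumed):

```cuda
#include <cuda_runtime.h>

__global__ void process(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;  // stand-in for the real mesh update
}

// Copies for one chunk are issued in a second stream so they overlap
// with the kernel still processing the previous chunk. The host
// buffers h[] must be page-locked (cudaHostAlloc) for true overlap.
void pipeline(float *h[2], float *d[2], int n, int nchunks) {
    cudaStream_t s[2];
    cudaStreamCreate(&s[0]);
    cudaStreamCreate(&s[1]);
    for (int c = 0; c < nchunks; ++c) {
        int b = c & 1;  // ping-pong between the two streams
        cudaMemcpyAsync(d[b], h[b], n * sizeof(float),
                        cudaMemcpyHostToDevice, s[b]);
        process<<<(n + 255) / 256, 256, 0, s[b]>>>(d[b], n);
    }
    cudaDeviceSynchronize();
}
```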
Feb, 19
GPU Accelerated Scalable Parallel Random Number Generators
SPRNG (Scalable Parallel Random Number Generators) is widely used in computational science applications, particularly on parallel systems. The lagged Fibonacci generator (LFG) and the linear congruential generator (LCG) are two frequently used generators in this library. In this paper, LFG and LCG are implemented on GPUs in CUDA. As a library for providing random numbers to GPU scientific applications, GASPRNG […]
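A hedged sketch of the LCG recurrence x_{k+1} = (a·x_k + c) mod 2^64 with one generator per thread; the constants are Knuth's MMIX parameters, used here for illustration rather than SPRNG's parameterization:

```cuda
__global__ void lcg_fill(unsigned long long *seed, float *out,
                         int per_thread) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned long long x = seed[tid];  // each thread owns one stream
    for (int k = 0; k < per_thread; ++k) {
        x = 6364136223846793005ULL * x + 1442695040888963407ULL;
        // Take the top 24 bits and scale to [0, 1).
        out[tid * per_thread + k] = (x >> 40) * (1.0f / 16777216.0f);
    }
    seed[tid] = x;  // persist state for the next launch
}
```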
Feb, 19
Faster File Matching using GPGPUs
We address the problem of file matching by modifying the MD6 algorithm, which is well suited to taking advantage of GPU computing. MD6 is a cryptographic hash function that is tree-based and highly parallelizable. When the message M is available initially, the hashing operations can be initiated at different starting points within the message and […]
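A sketch of the tree-hash pattern that makes MD6 parallel; the compression function below is a placeholder, not MD6's real compression function:

```cuda
typedef unsigned long long digest_t;

// Placeholder mixer standing in for MD6's compression function f.
__device__ digest_t compress(digest_t a, digest_t b) {
    return (a ^ (b * 0x9E3779B97F4A7C15ULL)) + (a << 7);
}

// Leaves are hashed independently; each level combines pairs of
// digests until one root digest remains.
__global__ void hash_level(const digest_t *in, digest_t *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n / 2)
        out[i] = compress(in[2 * i], in[2 * i + 1]);  // independent pairs
}
// Host loop: launch hash_level repeatedly, halving n, until n == 1.
```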
Feb, 19
Efficiency Considerations of Cauchy Reed-Solomon Implementations on Accelerator and Multi-Core Platforms
The Cauchy variant of the Reed-Solomon algorithm is implemented on accelerator platforms including GPGPU, FPGA, CellBE and ClearSpeed, as well as on an x86 multi-core system. The sustained throughput performance and kernel rates are measured for a 5+3 Reed-Solomon scheme. To compare the different technology platforms, an efficiency metric is introduced and the platforms are categorized […]
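A simplified, host-side sketch of the XOR-only structure that makes Cauchy Reed-Solomon attractive on these platforms; the real scheme expands each GF(2^w) coefficient into a w×w bit matrix, which is collapsed to a single bit here for brevity:

```cuda
enum { K = 5, M = 3 };  // 5 data blocks, 3 parity blocks (5+3 scheme)

// Each parity word is the XOR of the data words selected by its row
// of the (assumed precomputed) bit matrix, so the encoder needs no
// Galois-field multiplications.
void cauchy_rs_encode(const unsigned int *data[K],
                      unsigned int *parity[M],
                      const unsigned char bitrow[M][K],  // 0/1 matrix
                      int words) {
    for (int p = 0; p < M; ++p)
        for (int w = 0; w < words; ++w) {
            unsigned int acc = 0;
            for (int d = 0; d < K; ++d)
                if (bitrow[p][d]) acc ^= data[d][w];  // XOR-only
            parity[p][w] = acc;
        }
}
```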