Posts

Jul, 18

Robust GPGPU plugin development for RapidMiner

In recent years, a significant number of papers [1][2] have been published about general-purpose graphics processing unit (GPGPU) programs, which can accelerate computationally intensive applications several times over conventional CPU programs. These papers raise an important question: with the current developer tools, is it possible to integrate these GPU programs into a major […]
Jul, 18

Temporal Blending for Adaptive SPH

In this paper we introduce a fast and consistent Smoothed Particle Hydrodynamics (SPH) technique which is suitable for convection-diffusion simulations of incompressible fluids. We apply our temporal blending technique to reduce the number of particles in the simulation while smoothly changing quantity fields. Our approach greatly reduces the error introduced in the pressure term when […]
Jul, 18

Fluid Simulation on Surfaces in the GPU

In this paper we present a method to simulate fluids on smooth surfaces of arbitrary topology using a graphics processing unit (GPU). To do this we use the parametrization of Catmull-Clark subdivision surfaces, and obtain the metric information of the distortion caused by this parametrization, so we can calculate differential operators of functions defined on […]
Jul, 17

CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures

Data-parallel languages feature fine-grained parallel primitives that can be supported by compilers targeting modern many-core architectures where data parallelism must be exploited to fully utilize the hardware. Previous research has focused on converting data-parallel languages for SIMD (single instruction multiple data) architectures. However, directly applying them to today’s SIMT (single instruction multiple thread) architectures does […]
Jul, 17

Interactively Simulating Fluid based on SPH and CUDA

In this paper, we propose a novel method of interactive fluid simulation based on SPH, and implement it on CUDA (Compute Unified Device Architecture). First, we use SPH (Smoothed Particle Hydrodynamics) theory to simulate the motion of fluids. Second, we propose a method of interaction between fluid and rigid objects. We treat the rigid objects as […]
Jul, 17

CUSIMANN: An optimized simulated annealing software for GPUs

CUSIMANN (CUDA SIMULATED ANNEALING) is a free/open-source library for global optimization that provides a parallel implementation of the simulated annealing algorithm in CUDA.
Jul, 17

Scalable Molecular Dynamics Simulation Using FPGAs and Multicore Processors

While Molecular Dynamics Simulation (MD) uses a large fraction of the world’s High Performance Compute cycles, the modeling of many physical phenomena remains far out of reach. Improving the cost-effectiveness of MD has therefore received much attention, especially in using accelerators or modifying the computation itself. While both approaches have demonstrated great potential, scalability has […]
Jul, 17

Multicore and Manycore Algorithms for Octrees

Octrees and compressed octrees are frequently used to represent data in a hierarchical form for high performance computing, graphics and database applications. Applications like N-body problems require building octrees multiple times. Therefore, efficient construction of octrees is critical to the efficiency of the entire application. With ever increasing data size, there is a requirement to […]
Jul, 16

Optimizing MapReduce for GPUs with effective shared memory usage

Accelerators and heterogeneous architectures in general, and GPUs in particular, have recently emerged as major players in high performance computing. For many classes of applications, MapReduce has emerged as the framework for easing parallel programming and improving programmer productivity. There have already been several efforts on implementing MapReduce on GPUs. In this paper, we propose […]
Jul, 16

Sparse Matrix-Vector Multiplication on NVIDIA GPU

In this paper, we present our work on developing a new matrix format and a new sparse matrix-vector multiplication algorithm. The matrix format is HEC, which is a hybrid format. This matrix format is efficient for sparse matrix-vector multiplication and is friendly to preconditioners. Numerical experiments show that our sparse matrix-vector multiplication algorithm is efficient […]
Jul, 16

Sparse Matrix Matrix Multiplication on Hybrid CPU+GPU Platforms

Sparse matrix-sparse/dense matrix multiplications, spgemm and csrmm, find usage, among other applications, in various matrix formulations of graph problems. GPU-based supercomputers are presently experiencing severe performance issues on the Graph-500 benchmarks, a new HPC benchmark suite focusing on graph algorithms. Considering the difficulties in executing graph problems and the duality between graphs and matrices, […]
Jul, 16

A Yoke of Oxen and a Thousand Chickens for Heavy Lifting Graph Processing

Large, real-world graphs are famously difficult to process efficiently. Not only do they have a large memory footprint, but most graph processing algorithms entail memory access patterns with poor locality, data-dependent parallelism, and a low compute-to-memory access ratio. Additionally, most real-world graphs have a low diameter and a highly heterogeneous node degree distribution. Partitioning these graphs […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org