Posts
Jun, 3
GPU Join Processing Revisited
Until recently, the use of graphics processing units (GPUs) for query processing was limited by the amount of memory on the graphics card, a few gigabytes at best. Moreover, input tables had to be copied to GPU memory before they could be processed, and after computation was completed, query results had to be copied back […]
Jun, 3
Parallel Triangular Solvers on GPU
In this paper, we systematically investigate GPU-based parallel triangular solvers. Parallel triangular solvers are fundamental to the incomplete LU factorization family of preconditioners and to algebraic multigrid solvers. We develop a new matrix format suitable for GPU devices. Parallel lower and upper triangular solvers are developed for this new data structure. With these solvers, […]
Jun, 3
Parallel Agent systems on a GPU for use with Simulations and Games
In this paper we describe a parallel agent-based computing system. The agents are placed in GPU memory and executed in parallel on the GPU. We discuss the difficulties in creating this system and provide solutions to each of the problems encountered. We then go on to describe a test bed application for the implementation […]
Jun, 3
Reservoir Simulation on NVIDIA Tesla GPUs
In this paper, we introduce our work on accelerating a black oil simulator using GPU-based parallel iterative linear solvers. We develop iterative linear solvers and several commonly used preconditioners on NVIDIA Tesla GPUs. These solvers and preconditioners are coupled with our in-house reservoir simulator. Numerical experiments show that our GPU-based black oil simulator is sped […]
Jun, 3
Ultra-Fast Displaying Spectral Domain Optical Doppler Tomography System Using a Graphics Processing Unit
We demonstrate an ultrafast displaying Spectral Domain Optical Doppler Tomography system using Graphics Processing Unit (GPU) computing. The calculation of FFT and the Doppler frequency shift is accelerated by the GPU. Our system can display processed OCT and ODT images simultaneously in real time at 120 fps for 1,024 pixels x 512 lateral A-scans. The […]
Jun, 1
Towards Distributed Heterogenous High-Performance Computing with ViennaCL
One of the major drawbacks of computing with graphics adapters is the limited available memory for relevant problem sizes. To overcome this limitation for the ViennaCL library, we investigate a partitioning approach for one of the standard benchmark problems in High-Performance Computing (HPC), namely the dense matrix-matrix product. We apply this partitioning approach to problems […]
Jun, 1
MPI-ACC: An Integrated and Extensible Approach to Data Movement in Accelerator-Based Systems
Data movement in high-performance computing systems accelerated by graphics processing units (GPUs) remains a challenging problem. Data communication in popular parallel programming models, such as the Message Passing Interface (MPI), is currently limited to the data stored in the CPU memory space. Auxiliary memory systems, such as GPU memory, are not integrated into such data […]
Jun, 1
A Data-Parallel Extension to Ruby for GPGPU
We propose Ikra, a data-parallel extension to Ruby for general-purpose computing on graphics processing units (GPGPU). Our approach is to provide a special array class with higher-order methods for describing computation on a GPU. With a static type inference system that identifies code fragments that shall be executed on a GPU and with a skeleton-based […]
Jun, 1
Generating Device-specific GPU code for Local Operators in Medical Imaging
To cope with the complexity of programming GPU accelerators for medical imaging computations, we developed a framework to describe image processing kernels in a domain-specific language, which is embedded into C++. The description uses decoupled access/execute metadata, which allow the programmer to specify both execution constraints and memory access patterns of kernels. A source-to-source compiler […]
Jun, 1
An open source MATLAB program for fast numerical Feynman integral calculations for open quantum system dynamics on GPUs
This MATLAB program calculates the dynamics of the reduced density matrix of an open quantum system modeled by the Feynman-Vernon model. The user gives the program a vector describing the coordinate of an open quantum system, a Hamiltonian matrix describing its energy, and a spectral distribution function and temperature describing the environment’s influence on it, […]
May, 30
clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs
The sparse matrix-vector multiplication (SpMV) kernel is a key computation in linear algebra. Most iterative methods are composed of SpMV operations with BLAS1 updates. Therefore, researchers make extensive efforts to optimize the SpMV kernel in sparse linear algebra. With the appearance of OpenCL, a programming language that standardizes parallel programming across a wide variety of […]
May, 30
Large-scale Nanostructure Simulations from X-ray Scattering Data On Graphics Processor Clusters
X-ray scattering is a valuable tool for measuring the structural properties of materials used in the design and fabrication of energy-relevant nanodevices (e.g., photovoltaic, energy storage, battery, fuel, and carbon capture and sequestration devices) that are key to the reduction of carbon emissions. Although today’s ultra-fast X-ray scattering detectors can provide tremendous information on the […]