Posts
Nov, 10
Parallel execution of a parameter sweep for molecular dynamics simulations in a hybrid GPU/CPU environment
Molecular Dynamics (MD) simulations can help to understand an immense number of phenomena at the nano- and microscale. They often require the exploration of a large parameter space, and a possible parallelization strategy consists of sending different parameter sets to different processors. Here we present such an approach using a hybrid environment of Graphics Processing Units (GPUs) […]
Nov, 10
Studies of quantum dots: Ab initio coupled-cluster analysis using OpenCL and GPU programming
Quantum dots, that is, strongly confined electrons, show a variety of interesting properties. Of relevance in both experiments and various technical components is the possibility to fine-tune their electrical and optical properties. Quantum dots can be manufactured in practice by a number of different techniques, but in this thesis we have employed computer simulations […]
Nov, 8
Architectural Considerations for Compiler-guided Unroll-and-Jam of CUDA Kernels
Hundreds of cores per chip and support for fine-grain multithreading have made GPUs a central player in today's HPC world. Much of the responsibility for achieving high performance on these complex systems lies with software such as the compiler. This paper describes a compiler-based strategy for automatic and profitable application of the unroll-and-jam transformation to CUDA […]
Nov, 8
Flexible N-Way MIMO Detector on GPU
This paper proposes a flexible Multiple-Input Multiple-Output (MIMO) detector on graphics processing units (GPUs). MIMO detection is a key technology in broadband wireless systems such as LTE, WiMAX, and 802.11n. Existing detectors either use costly sorting for better performance or sacrifice sorting for higher throughput. To achieve good performance with high throughput, our detector runs multiple […]
Nov, 8
Reusable OpenCL FPGA Infrastructure
OpenCL has emerged as a standard programming model for heterogeneous systems. Recent work combining OpenCL and FPGAs has focused on high-level synthesis. Building a complete OpenCL FPGA system requires more than just high-level synthesis. This work introduces a reusable OpenCL infrastructure for FPGAs that complements previous work and specifically targets a key architectural element – […]
Nov, 8
SPH Based Fluid Animation Using CUDA Enabled GPU
Realistic fluid animation is an inherent part of special effects in the film and gaming industry. These animations are created through the simulation of highly compute-intensive fluid models. The computations involved in executing a fluid model call for a high-performance parallel system to achieve real-time animation. This paper is primarily devoted to […]
Nov, 8
Massively parallel Monte Carlo for many-particle simulations on GPUs
Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherently serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance […]
Nov, 7
Efficient implementation of data flow graphs on multi-GPU clusters
Nowadays, it is possible to build a multi-GPU supercomputer, well suited for the implementation of digital signal processing algorithms, for a few thousand dollars. However, to achieve the highest performance with this kind of architecture, the programmer has to focus on inter-processor communication and task synchronization. In this paper, we propose a high-level programming model based […]
Nov, 7
Study of Convolution Algorithms using CPU and Graphics Hardware
In this thesis we evaluate different two-dimensional image convolution algorithms using Fast Fourier Transform (FFT) libraries on the CPU and on graphics hardware, using Compute Unified Device Architecture (CUDA). The final product is used in VISSLA (VISualisation tool for Simulation of Light scattering and Aberrations), a software tool written in MATLAB. VISSLA is used to […]
Nov, 7
A GPU-Based Transient Stability Simulation Using Runge-Kutta Integration Algorithm
Graphics processing units (GPUs) have been investigated as a way to unlock computational capability in various scientific applications. Recent research shows that careful consideration is needed to exploit the advantages of GPUs while avoiding their shortcomings. In this paper, the impact of GPU acceleration on implicit and explicit integrators in transient stability simulation is investigated. […]
Nov, 7
GPU Virtualization
In modern computing, the Graphics Processing Unit (GPU) has proven its worth beyond graphics rendering. Its usage has extended into the field of general-purpose computing, where applications exploit the GPU’s massive parallelism to accelerate their tasks. Meanwhile, Virtual Machines (VMs) continue to provide utility and security by emulating entire computer hardware platforms […]
Nov, 7
Connectivity-Based Segmentation for GPU-Accelerated Mesh Decompression
We present a novel algorithm to partition large 3D meshes for GPU-accelerated decompression. Our formulation focuses on minimizing the replicated vertices between patches and balancing the number of faces per patch for efficient parallel computing. First, we generate a topology model of the original mesh and remove vertex positions. Then we assign the centers of […]