Posts
Mar, 9
GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors
Next-generation, high-throughput sequencers are now capable of producing hundreds of billions of short sequences (reads) in a single day. The task of accurately mapping the reads back to a reference genome is of particular importance because it is used in several other biological applications, e.g., genome re-sequencing, DNA methylation, and ChIP sequencing. On a personal […]
Mar, 9
Classical Simulation of Quantum Adiabatic Algorithms using Mathematica on GPUs
In this paper we present a simulation environment enhanced with parallel processing which can be used on personal computers, based on a high-level user interface developed on Mathematica© which is connected to C++ code in order to make our platform capable of communicating with a Graphics Processing Unit. We introduce the reader to the behavior […]
Mar, 8
Using common graphics hardware for multi-agent traffic simulation with CUDA
Today’s graphics processing units (GPUs) have tremendous resources when it comes to raw computing power. Simulating large groups of agents in transport simulation demands a great deal of computation time. It therefore seems reasonable to try to harness this computing power for traffic simulation. Unfortunately, simulating a network of traffic is inherently connected […]
Mar, 8
Fast heterogeneous computing with CUDA compatible Tesla GPU computing processor (personal supercomputing)
This paper presents how fast heterogeneous computing can be achieved with a Tesla GPU computing processor. A Tesla GPU supercomputer brings the performance of a cluster to a workstation, turning it into a supercomputer. We have chosen the field of molecular dynamics to show fast, high-performance computing with a Tesla GPU. We have given a DCS […]
Mar, 8
Performance and Scalability of GPU-Based Convolutional Neural Networks
In this paper we present the implementation of a framework for accelerating training and classification of arbitrary Convolutional Neural Networks (CNNs) on the GPU. CNNs are a derivative of standard Multilayer Perceptron (MLP) neural networks optimized for two-dimensional pattern recognition problems such as Optical Character Recognition (OCR) or face detection. We describe the basic parts […]
Mar, 8
A GPU-based finite-size pencil beam algorithm with 3D-density correction for radiotherapy dose calculation
To develop an accurate and efficient dose calculation engine for online adaptive radiotherapy, we have implemented a finite-size pencil beam (FSPB) algorithm with a 3D-density correction method on the GPU. This new GPU-based dose engine is built on our previously published ultrafast FSPB computational framework [Gu et al. Phys. Med. Biol. 54 6287-97, 2009]. […]
Mar, 8
General-purpose molecular dynamics simulations on GPU-based clusters
We present a GPU implementation of LAMMPS, a widely-used parallel molecular dynamics (MD) software package, and show 5x to 13x single node speedups versus the CPU-only version of LAMMPS. This new CUDA package for LAMMPS also enables multi-GPU simulation on hybrid heterogeneous clusters, using MPI for inter-node communication, CUDA kernels on the GPU for all […]
Mar, 8
Porting of an Edge-Based CFD Solver to GPUs
Graphics processing units (GPUs) are increasingly becoming a mainstream platform for high performance computational fluid dynamics. This paper describes the porting of a substantial portion of FEFLO, an adaptive, edge-based finite element code for the solution of compressible and incompressible flow, to run on GPUs. The code is primarily written in Fortran 77 and has […]
Mar, 8
Accelerating H.264 inter prediction in a GPU by using CUDA
H.264/AVC defines a very efficient algorithm for inter prediction, but it is very time-consuming. With the emergence of general-purpose graphics processing units (GPGPU), a new door has been opened to running this video algorithm on these small processing units. In this paper, a step forward is taken towards an implementation of the […]
Mar, 8
Offloading Region Matching of Data Distribution Management with CUDA
Data distribution management (DDM) aims to reduce the transmission of irrelevant data between High Level Architecture (HLA) compliant simulators by taking their interesting regions into account (i.e., region matching). In a large-scale simulation, computation-intensive region matching would have a direct impact on simulation performance. To deal with the high computation cost of region […]
Mar, 8
Preliminary implementation of VQ image coding using GPGPU
GPGPU (general-purpose computing on graphics processing units), which uses GPUs for general-purpose computations such as numerical calculations as well as for graphics processing, is attracting a great deal of attention. In this paper, as an example of hierarchical clustering algorithms, we evaluate PNN (pairwise nearest neighbor) on GPUs by using CUDA (compute unified device architecture). We also […]