high performance computing on graphics processing units: hgpu.org

Posts

Jun, 14

Accelerating Dynamic Time Warping Subsequence Search with GPUs and FPGAs

Many time series data mining problems require subsequence similarity search as a subroutine. Dozens of similarity/distance measures have been proposed in the last decade and there is increasing evidence that Dynamic Time Warping (DTW) is the best measure across a wide range of domains. Given DTW’s usefulness and ubiquity, there has been a large community-wide […]

CUDA

Jun, 14

Memory-efficient implementation of a graphics processor-based cluster detection algorithm for large spatial databases

Numerous approaches have been proposed for detecting clusters, groups of data in spatial databases. Of these, the algorithm known as Density Based Spatial Clustering of Applications with Noise (DBSCAN) is a recent approach which has proven efficient for larger databases. Graphical Processing Units (GPUs), used originally to aid in the processing of high intensity graphics, […]

Jun, 14

Parallel implementation of artificial neural network training

In this paper we describe the implementation of a complete ANN training procedure for speech recognition using the block mode back-propagation learning algorithm. We exploit the high performance SIMD architecture of the GPU using CUDA and its C-like language interface. We also compare the speed-up obtained by implementing the training procedure taking advantage only of the multi-thread […]

CUDA

Jun, 14

CuParcone A High-Performance Evolvable Neural Network Model

An algorithm for evolving recurrent neural networks via a genetic algorithm was implemented in CUDA, resulting in a system called CuParcone (CUDA-based Partially Connected Neural Evolutionary). Run on an Nvidia Tesla "GPU supercomputer," CuParcone achieves a 323-fold performance increase in face gender recognition compared to the comparable Parcone algorithm on a […]

CUDA

Jun, 11

2nd International Workshop on GPUs and Scientific Applications, GPUScA 2011

Held in conjunction with PACT 2011. GPUs are cost-effective platforms for computationally intensive applications, providing tremendous peak performance. However, it is a major challenge to deliver the intrinsic performance of such architectures to end applications. The goal of this workshop is to bring together GPU experts with computational science experts. The workshop addresses programming approaches […]

Jun, 11

IEEE 18th International Symposium on High Performance Computer Architecture, HPCA 2012

The International Symposium on High-Performance Computer Architecture (HPCA 2012) provides a high-quality forum for scientists and engineers to present their latest research findings in this rapidly changing field. Authors are invited to submit papers on all aspects of high-performance computer architecture. Topics of interest include, but are not limited to: Processor, cache and memory architectures; Parallel […]

Jun, 10

Accelerating light scattering simulations of nanostructures by reconfigurable computing

In order to characterize nanostructures and nanosurfaces in production processes, measuring methods based on light scattering are gaining importance. Thus the capability to simulate laser light scattering on surfaces several hundred or thousand wavelengths in diameter, together with light scattering models on the nanometer scale, is required to validate these new measurement […]

Jun, 10

Massively LDPC Decoding on Multicore Architectures

Unlike usual VLSI approaches necessary for the computation of intensive Low-Density Parity-Check (LDPC) code decoders, this paper presents flexible software-based LDPC decoders. Algorithms and data structures suitable for parallel computing are proposed in this paper to perform LDPC decoding on multicore architectures. To evaluate the efficiency of the proposed parallel algorithms, LDPC decoders were developed […]

CUDA

Jun, 10

CUDA Based Fast Implementation of Very Large Matrix Computation

CUDA (Compute Unified Device Architecture) acceleration of very large scale matrix-vector and matrix-matrix multiplication is presented in this paper. The intrinsic parallelism in the matrix computations is exploited thoroughly. By dividing the entire matrix computation into multiple sub-groups, scalable performance improvement can be achieved using multiple GPUs. The key operations are accelerated by the GPU. And […]
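
As a rough illustration of what dividing a matrix computation into sub-groups can look like, here is a minimal CPU-only sketch in Python of a row-blocked matrix-vector product. It is not the paper's implementation: the function names (`split_rows`, `matvec_block`, `blocked_matvec`) and the choice of a row-wise split are assumptions made for this example, and each block is computed sequentially where a multi-GPU version would hand one block to each device.

```python
def split_rows(n_rows, n_parts):
    """Return (start, stop) row ranges covering range(n_rows) in n_parts near-equal blocks."""
    base, extra = divmod(n_rows, n_parts)
    ranges = []
    start = 0
    for i in range(n_parts):
        # The first `extra` blocks take one additional row each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def matvec_block(rows, x):
    """Multiply one block of matrix rows by the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def blocked_matvec(A, x, n_parts):
    """Compute A @ x block by block (hypothetical stand-in for one block per device)."""
    y = []
    for start, stop in split_rows(len(A), n_parts):
        # Blocks are independent, so each could run on a separate device.
        y.extend(matvec_block(A[start:stop], x))
    return y
```

Because the row blocks share no output entries, the partial results only need to be concatenated in order, which is what makes this kind of split attractive when scaling across several devices.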

CUDA

Jun, 10

Planetary-Scale Terrain Composition

Many interrelated planetary height map and surface image map data sets exist, and more data are collected each day. Broad communities of scientists require tools to compose these data interactively and explore them via real-time visualization. While related, these data sets are often unregistered with one another, having different projection, resolution, format, and type. We […]

Jun, 10

The Research of Real-Time Shadow Rendering Algorithm of Virtual Scenes

Shadow rendering by shadow mapping has long suffered from under-sampling artifacts caused by insufficient shadow map resolution, which leads to so-called perspective and projection aliasing. To address this issue, we present a new practical real-time shadow mapping algorithm. First, we sample the scene from the eye point on the GPU to get the needed […]

Jun, 10

Accelerating Multi-layer Perceptron based short term demand forecasting using Graphics Processing Units

Load forecasting plays a vitally important role in the operation and planning of the power system in a deregulated electricity market. A large variety of methods have been proposed for load forecasting. In this paper, we introduce Graphics Processing Unit (GPU) based computing to accelerate short-term load forecasting with a multi-layer perceptron (MLP). […]