Sep, 1

Performance Evaluations of Graph Database using CUDA and OpenMP-Compatible Libraries

Graph databases use graph structures to store data sets as nodes, edges, and properties. They are used to store and search the relationships between a large number of nodes, in applications such as social networking services and recommendation engines that use customer social graphs. Since the computation cost of graph search queries increases as the graph becomes large, […]
Sep, 1

Fast reconstruction of 3D volumes from 2D CT projection data with GPUs

Biomedical image reconstruction applications require producing high-fidelity images in or close to real time. We have implemented reconstruction of three-dimensional cone-beam computed tomography (CBCT) volumes from two-dimensional projections. The algorithm takes slices of the target, weights and filters them to backproject the data, then creates the final 3D volume. We have implemented the algorithm using […]
Sep, 1

Scalable Kernel Fusion for Memory-Bound GPU Applications

GPU implementations of HPC applications relying on finite difference methods can include tens of kernels that are memory-bound. Kernel fusion can improve performance by reducing data traffic to off-chip memory; kernels that share data arrays are fused to larger kernels where on-chip cache is used to hold the data reused by instructions originating from different […]
Aug, 27

Optimization of Data-Parallel Scientific Applications on Highly Heterogeneous Modern HPC Platforms

Over the past decade, the design of microprocessors has been shifting to a new model where the microprocessor has multiple homogeneous processing units, aka cores, as a result of heat dissipation and energy consumption issues. Meanwhile, the demand for heterogeneity increases in computing systems due to the need for high performance computing in recent years. […]
Aug, 27

Surface Normal Integration for Convex Space-time Multi-view Reconstruction

We show that surface normal information makes it possible to significantly improve the accuracy of a spatio-temporal multi-view reconstruction. On the one hand, normal information can improve the quality of photometric matching scores. On the other hand, the same normal information can be employed to drive an adaptive anisotropic surface regularization process which better preserves fine details and […]
Aug, 27

High Performance Financial Simulation Using Randomized Quasi-Monte Carlo Methods

GPU computing has become popular in computational finance and many financial institutions are moving their CPU based applications to the GPU platform. Since most Monte Carlo algorithms are embarrassingly parallel, they benefit greatly from parallel implementations, and consequently Monte Carlo has become a focal point in GPU computing. GPU speed-up examples reported in the literature […]
Aug, 27

A Framework for Lattice QCD Calculations on GPUs

Computing platforms equipped with accelerators like GPUs have proven to provide great computational power. However, exploiting such platforms for existing scientific applications is not a trivial task. Current GPU programming frameworks such as CUDA C/C++ require low-level programming from the developer in order to achieve high performance code. As a result porting of applications to […]
Aug, 27

Algorithms for Solving Non-Stationary Heat Conduction Problem for Design of a Technical Device

A model of a multilayer device with non-trivial geometrical and material structure and its working process is suggested. The thermal behavior of the device, one of its principal characteristics, is simulated. An algorithm for solving the non-stationary heat conduction problem with a time-dependent periodic heating source is suggested. The algorithm is based on finite difference explicit–implicit […]
Aug, 26

HSApriori: High Speed Association Rule Mining using Apriori Based Algorithm for GPU

Apriori-based algorithms are widely used for association rule mining. However, these algorithms cannot exploit the parallel processing power of modern GPUs (Graphics Processing Units). To make an algorithm compatible with the GPU, its data representation, parallel processing, and support counting need to be changed. In this paper we propose an […]
Aug, 26

Bandwidth Requirements of GPU Architectures

A new trend in chip multiprocessor (CMP) design is to incorporate graphics processing unit (GPU) cores, making them heterogeneous. GPU cores have a higher bandwidth requirement than CPU cores, as they tend to generate many more memory requests. In order to achieve good performance, there must be sufficient bandwidth between the GPU shader cores and […]
Aug, 26

An Investigation of Unified Memory Access Performance in CUDA

Managing memory between the CPU and GPU is a major challenge in GPU computing. A programming model, Unified Memory Access (UMA), has been recently introduced by Nvidia to simplify the complexities of memory management while claiming good overall performance. In this paper, we investigate this programming model and evaluate its performance and programming model simplifications […]
Aug, 26

Acceleration of Various Direct/Iterative Solvers for MoM by GPU and Its Computational Cost

Various guidelines for accelerating MoM by GPU computing are summarized. Acceleration of direct/iterative solvers for MoM using the GPU is realized. A quantitative study of computing time shows the performance of each guideline.

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of computer time per run on two nodes: one with two AMD graphics processing units, the other with one AMD and one nVidia graphics processing unit. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there); compilation and execution terminal output logs will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors
