Apr, 15

The development of Mellanox/NVIDIA GPUDirect over InfiniBand: a new model for GPU to GPU communications

The adoption of General Purpose GPUs (GPGPU) in HPC systems is increasing due to the unparalleled performance advantage of GPUs and their ability to fulfill the ever-increasing demand for floating-point operations. While the GPU can offload many of an application's parallel computations, the system architecture of a GPU-CPU-InfiniBand server does require […]
Apr, 14

Massively parallel implementation of cyclic LDPC codes on a general purpose graphics processing unit

Simulation of low-density parity-check (LDPC) codes frequently takes several days, thus the use of general purpose graphics processing units (GPGPUs) is very promising. However, GPGPUs are designed for compute-intensive applications, and they are not optimized for data caching or control management. In LDPC decoding, the parity check matrix H needs to be accessed at every […]
Apr, 14

Count Sort for GPU Computing

Counting sort is a simple, stable and efficient sorting algorithm with linear running time, and a fundamental building block for many applications. This paper describes the design issues of a data-parallel implementation of the count sort algorithm on a commodity multiprocessor GPU using the Compute Unified Device Architecture (CUDA) platform, both from NVIDIA […]
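
For readers unfamiliar with the algorithm, the following is a minimal sequential sketch of a stable counting sort in Python. It is not the paper's CUDA implementation; the function name `counting_sort_by_key` and its parameters are hypothetical and chosen only for illustration. It assumes the keys are small non-negative integers no larger than `max_key`.

```python
def counting_sort_by_key(items, key, max_key):
    """Stable counting sort of `items` by integer key in [0, max_key].

    Illustrative, sequential sketch only (hypothetical helper, not the
    paper's GPU code). Runs in O(n + max_key) time.
    """
    # Pass 1: histogram of key values.
    counts = [0] * (max_key + 1)
    for item in items:
        counts[key(item)] += 1

    # Pass 2: exclusive prefix sum turns counts into start offsets.
    offsets = [0] * (max_key + 1)
    running = 0
    for value in range(max_key + 1):
        offsets[value] = running
        running += counts[value]

    # Pass 3: scatter items to their slots in input order,
    # which keeps equal keys in their original relative order.
    out = [None] * len(items)
    for item in items:
        k = key(item)
        out[offsets[k]] = item
        offsets[k] += 1
    return out


records = [(1, "a"), (0, "b"), (1, "c"), (2, "d"), (0, "e")]
sorted_records = counting_sort_by_key(records, key=lambda r: r[0], max_key=2)
```

The histogram and prefix-sum passes are the parts that data-parallel implementations typically distribute across threads; the sketch above keeps them sequential for clarity.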
Apr, 14

GPU-based high-speed and high-precision visual tracking

This paper presents a method for implementing the ESM visual tracker proposed by Malis et al. on a GPU to realize fast and accurate visual tracking. The ESM tracker is especially effective for images in which feature points are difficult to obtain, since it uses all the pixels of the target image region. Although […]
Apr, 14

A high-performance fault-tolerant software framework for memory on commodity GPUs

As GPUs are increasingly used to accelerate HPC applications by allowing more flexibility and programmability, their fault tolerance is becoming much more important than before when they were used only for graphics. The current generation of GPUs, however, does not have standard error detection and correction capabilities, such as SEC-DED ECC for DRAM, which is […]
Apr, 14

GPU acceleration of method of moments matrix assembly using Rao-Wilton-Glisson basis functions

In this paper, a GPU-accelerated implementation of the matrix assembly phase of the method of moments is presented. The modelling of PEC structures using the electric field integral equation and the Rao-Wilton-Glisson basis functions is considered. NVIDIA CUDA is used for the GPU development, and the double precision support offered by […]
Apr, 14

Accelerating spatial clustering detection of epidemic disease with graphics processing unit

The statistics of disease clustering are of interest to epidemiologists. In order to detect spatial clustering of disease in all the regions of China, we adopted a likelihood-ratio-based method which uses Monte Carlo simulation and spatial exploration to analyze data that is updated in real time and stored in a database. However, a large number of random tests […]
Apr, 14

Scaleable Sparse Matrix-Vector Multiplication with Functional Memory and GPUs

Sparse matrix-vector multiplication on GPUs faces a serious problem when the vector is too large to be stored in the GPU's device memory. To solve this problem, we propose a novel software-hardware hybrid method for a heterogeneous system with GPUs and functional memory modules connected by PCI Express. The functional memory contains huge capacity […]
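
To illustrate why the whole vector must be reachable during the multiplication, here is a minimal sequential sketch of sparse matrix-vector multiplication in Python. The abstract does not name a storage format; compressed sparse row (CSR) is assumed here, and the function name `csr_matvec` is hypothetical. It does not model the paper's functional memory or GPU kernels.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    Illustrative sketch only; CSR is an assumption, not the paper's format.
    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        for j in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect (gather) access: any element of x may be needed,
            # so x has to be addressable in full during the product.
            y[row] += values[j] * x[col_idx[j]]
    return y


# A = [[1, 0, 2],
#      [0, 3, 0]]
values = [1.0, 2.0, 3.0]
col_idx = [0, 2, 1]
row_ptr = [0, 2, 3]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 1.0, 1.0])
```

The gather `x[col_idx[j]]` is the access pattern that becomes problematic once `x` no longer fits in device memory.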
Apr, 14

Efficient design and implementation of visual computing algorithms on the GPU

In this paper, we explore the key factors in the design and implementation of visual computing (image processing and computer vision) algorithms on the massively parallel GPU (graphics processing unit). The goal of the exploration is to provide a common perspective and guidelines for using GPUs in visual computing applications. We have selected three nontrivial applications […]
Apr, 14

Implicit Feature-Based Alignment System for Radiotherapy

In this paper we present a robust alignment algorithm that corrects the effects of out-of-plane rotation, for use in the automatic alignment of Computed Tomography (CT) volumes and the generally low-quality fluoroscopic images in radiotherapy applications. Analyzing not only in-plane but also out-of-plane rotation effects on the Digitally Reconstructed Radiograph (DRR) images, we […]
Apr, 14

OpenCL/OpenGL approach for studying active Brownian motion

This work presents a methodology for studying active Brownian dynamics on ratchet potentials using interoperating OpenCL and OpenGL frameworks. Programming details and optimization issues are discussed, followed by a comparison of performance on different devices. Visualization time using an OpenGL buffer shared with OpenCL has been tested against another technique which, while using OpenGL, […]
Apr, 13

23rd International Conference on Parallel Computational Fluid Dynamics 2011, ParCFD 2011

ParCFD is the annual international conference devoted to the discussion of recent developments and applications of parallel computing in the field of CFD and related disciplines. Since the establishment of the ParCFD conference series, parallel computers have become the dominant form of large-scale computing. The emergence of multi-core and heterogeneous architectures in parallel computers has created new […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors