Xiaowen Chu, Chengjian Liu, Kai Ouyang, Ling Sing Yung, Hai Liu, Yiu-Wing Leung
In recent years, erasure coding has been adopted by large-scale cloud storage systems to replace data replication. With the increase of disk I/O throughput and network bandwidth, the speed of erasure coding becomes one of the key system bottlenecks. In this paper, we propose to offload the task of erasure coding to Graphics Processing Units […]
Johann A. Briffa, Stephan Wesemeyer
In this study, we present SimCommSys, a simulator of communication systems that we are releasing under an open source license. The core of the project is a set of C++ libraries defining communication system components and a distributed Monte Carlo simulator. Of principal interest is the error-control coding component, where various kinds of […]
Eugen Ruzicky, Markus Rupp, Peter Farkas, Atilio Gameiro
In this paper, a novel approximate algorithm for massively parallel decoding of trellis-based error-correcting codes (ECC) is presented. The potential effect of using such an optimized decoder to accelerate simulations of modern communication systems implementing the most recent communication standards, such as LTE-A (Long Term Evolution – Advanced), is evaluated quantitatively by presenting an […]
Michael Wu, Yang Sun, Guohui Wang, Joseph R. Cavallaro
Turbo code is a computationally intensive channel code that is widely used in current and upcoming wireless standards. General-purpose graphics processor unit (GPGPU) is a programmable commodity processor that achieves high performance computation power by using many simple cores. In this paper, we present a 3GPP LTE compliant Turbo decoder accelerator that takes advantage of […]
Keun Soo Yim, Cuong Pham, Mushfiq Saleheen, Zbigniew Kalbarczyk, Ravishankar Iyer
The high performance and relatively low cost of GPU-based platforms provide an attractive alternative for general-purpose high performance computing (HPC). However, emerging HPC applications usually have stricter output correctness requirements than typical GPU applications (e.g., 3D graphics). This paper first analyzes the error resiliency of GPGPU platforms using a fault injection tool we have […]
Matthew L. Curry, Anthony Skjellum, H. Lee Ward, Ron Brightwell
Reed-Solomon coding is a method of generating arbitrary amounts of checksum information from original data via matrix-vector multiplication in finite fields. Previous work has shown that CPUs are not well-matched to this type of computation, but recent graphical processing units (GPUs) have been shown through a case study to perform this encoding quickly for the […]
Gabriel Falcao Paiva Fernandes, Vitor Manuel Mendes da Silva, Marco Alexandre Cravo Gomes, Leonel Augusto Pires Seabra de Sousa
Low-Density Parity-Check (LDPC) codes are among the best error correcting codes known and have been adopted by data transmission standards, such as DVB-S2 or WiMax. They are based on binary sparse parity check matrices and usually represented by Tanner graphs. LDPC decoders require very intensive message-passing algorithms, also known as belief propagation. This paper proposes […]
Matthew L. Curry, H. Lee Ward, Anthony Skjellum, Ron Brightwell
While RAID is the prevailing method of creating reliable secondary storage infrastructure, many users desire more flexibility than offered by current implementations. Traditionally, RAID capabilities have been implemented largely in hardware in order to achieve the best performance possible, but hardware RAID has rigid designs that are costly to change. Software implementations are much more […]
G. Falcao, J. Andrade, V. Silva, L. Sousa
A new strategy is proposed for implementing computationally intensive high-throughput decoders based on the long length irregular LDPC codes adopted in the DVB-S2 standard. It is supported on manycore graphics processing unit (GPU) architectures, for performing parallel multi-threaded decoding of multiple codewords with reduced accesses to global memory. This novel approach is flexible and scalable, […]
Hyunwoo Ji, Junho Cho, Wonyong Sung
Simulation of low-density parity-check (LDPC) codes frequently takes several days, thus the use of general purpose graphics processing units (GPGPUs) is very promising. However, GPGPUs are designed for compute-intensive applications, and they are not optimized for data caching or control management. In LDPC decoding, the parity check matrix H needs to be accessed at every […]
Naoya Maruyama, Akira Nukada, Satoshi Matsuoka
Commodity off-the-shelf GPUs lack error checking mechanisms for graphics memory, whereas conventional HPC platforms have used hardware-based ECC for DRAMs. To alleviate this reliability concern, we propose a software-based ECC for GPGPU applications. We add small program codes to normal CUDA programs that compute ECCs for data residing in graphics memory so that transient bit-flips […]
Thomas Steinke, Kathrin Peter, Sebastian Borchert
The Cauchy variant of the Reed-Solomon algorithm is implemented on accelerator platforms including GPGPU, FPGA, CellBE and ClearSpeed as well as on an x86 multi-core system. The sustained throughput performance and kernel rates are measured for a 5+3 Reed-Solomon schema. To compare the different technology platforms, an efficiency is introduced and the platforms are categorized […]
HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
