OpenCL-Based Erasure Coding on Heterogeneous Architectures
North Carolina State University, Raleigh, NC
27th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2016), 2016
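@inproceedings{chen2016opencl,
  title={OpenCL-Based Erasure Coding on Heterogeneous Architectures},
  author={Chen, Guoyang and Zhou, Huiyang and Shen, Xipeng and Gahm, Josh and Venkat, Narayan and Booth, Skip and Marshall, John},
  booktitle={27th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2016)},
  year={2016}
}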
Erasure coding, Reed-Solomon coding in particular, is a key technique for dealing with failures in scale-out storage systems. However, due to its algorithmic complexity, the performance overhead of erasure coding can become a significant bottleneck in storage systems attempting to meet service level agreements (SLAs). Previous work has mainly leveraged SIMD (single-instruction, multiple-data) instruction extensions in general-purpose processors to improve processing throughput. In this work, we exploit state-of-the-art heterogeneous architectures, including GPUs, APUs, and FPGAs, to accelerate erasure coding. We leverage the OpenCL framework for our target heterogeneous architectures and propose code optimizations for each target architecture. Given their different hardware characteristics, we highlight the different optimization strategies for each of the target architectures. Defining throughput as the ratio of the input file size to the processing latency, we achieve 2.84 GB/s on a 28-core Xeon CPU, 3.90 GB/s on an NVIDIA K40m GPU, 0.56 GB/s on an AMD Carrizo APU, and 1.19 GB/s (5.35 GB/s if only considering the kernel execution latency) on an Altera Stratix V FPGA, when processing an 836.9 MB zipped file with a 30×33 encoding matrix. In comparison, the single-thread code using Intel's ISA-L library running on the Xeon CPU has a throughput of 0.13 GB/s.
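To make the computation behind these numbers concrete, the following is a minimal Python sketch of the core operation in Reed-Solomon-style encoding: each parity chunk is a linear combination of the data chunks over GF(2^8), with XOR as addition. It is not the paper's OpenCL code. The function names, the reduction polynomial 0x11D, and the toy all-ones coefficient matrix in the usage note are assumptions chosen for illustration; a real encoder would use a matrix (for example, Vandermonde or Cauchy based) that guarantees recoverability. The `throughput` helper simply restates the metric from the abstract, input size divided by processing latency.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes in GF(2^8) by shift-and-add.

    The reduction polynomial 0x11D is an assumption made for this sketch.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly            # reduce back into 8 bits
        b >>= 1
    return result


def encode(matrix, data_chunks):
    """Return one parity chunk per row of the (hypothetical) coefficient matrix.

    parity[i][pos] = XOR over j of gf_mul(matrix[i][j], data_chunks[j][pos])
    All data chunks are assumed to have the same length.
    """
    chunk_len = len(data_chunks[0])
    parity = []
    for row in matrix:
        out = bytearray(chunk_len)
        for coeff, chunk in zip(row, data_chunks):
            for pos in range(chunk_len):
                out[pos] ^= gf_mul(coeff, chunk[pos])
        parity.append(bytes(out))
    return parity


def throughput(input_bytes, latency_seconds):
    """Input size divided by processing latency, in bytes per second."""
    return input_bytes / latency_seconds
```

As a usage example, `encode([[1, 1, 1]], [b"\x01\x02", b"\x04\x08", b"\x10\x20"])` degenerates to a plain XOR parity of the three chunks, since multiplying by 1 leaves each byte unchanged. Every byte position is computed independently of the others, which is the kind of data parallelism that SIMD units, GPUs, and FPGAs can exploit.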
June 14, 2016 by hgpu