GHOST: GPGPU-Offloaded High Performance Storage I/O Deduplication for Primary Storage System
KAIST, Daejeon, South Korea
Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM ’12), 2012
@inproceedings{kim2012ghost,
title={GHOST: GPGPU-offloaded high performance storage I/O deduplication for primary storage system},
author={Kim, C. and Park, K.W. and Park, K.H.},
booktitle={Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores},
pages={17--26},
year={2012},
organization={ACM}
}
Data deduplication has been an effective way to eliminate redundant data, mainly in backup storage systems. Because recent primary storage systems in cloud services are also expected to hold redundant data, deduplication can bring significant cost savings to primary storage as well. However, primary storage requires throughput of several GB/s, whereas most conventional deduplication techniques target 200-300 MB/s. To build a high-performance deduplication system for primary storage, we thoroughly analyze the performance bottlenecks of previous deduplication systems. With recent improvements in flash devices such as SSDs, the bottleneck in primary storage deduplication lies not only in key-value store lookup but also in the computation for data segmentation and fingerprinting. To overcome these bottlenecks, we propose a new deduplication system that uses a GPGPU. The proposed system, termed GHOST, offloads and optimizes deduplication processing on the GPGPU with three techniques: (1) an In-Host Data Cache, (2) Destage-aware Data Offloading to the GPGPU, and (3) an In-GPGPU Table Cache of the key-value store. These techniques improve offloaded deduplication processing by about 10-20% over the naive approach on a reasonable primary storage workload. The proposed system achieves a maximum of 1.5 GB/s, about 5 times the throughput of CPU-only deduplication systems.
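The three stages the abstract names as bottlenecks (data segmentation, fingerprinting, and key-value store lookup) can be illustrated with a minimal CPU-only sketch. This is not the GHOST implementation: the fixed 4 KiB chunk size, the SHA-1 fingerprint, the in-memory `dict` standing in for the key-value store, and the function names `deduplicate` and `restore` are all assumptions made for illustration, and no GPGPU offloading or caching is shown.

```python
import hashlib


def deduplicate(data: bytes, index: dict, chunk_size: int = 4096):
    """Store only chunks not already in `index`.

    Returns (recipe, stored_bytes): `recipe` is the list of fingerprints
    needed to rebuild `data`; `stored_bytes` counts newly stored bytes.
    Fixed-size chunking and SHA-1 are illustrative assumptions.
    """
    recipe = []
    stored_bytes = 0
    for offset in range(0, len(data), chunk_size):
        # Segmentation: cut the stream into fixed-size chunks.
        chunk = data[offset:offset + chunk_size]
        # Fingerprinting: hash the chunk to get its identity.
        fingerprint = hashlib.sha1(chunk).digest()
        # Key-value lookup: store the chunk only if it is new.
        if fingerprint not in index:
            index[fingerprint] = chunk
            stored_bytes += len(chunk)
        recipe.append(fingerprint)
    return recipe, stored_bytes


def restore(recipe, index: dict) -> bytes:
    """Rebuild the original data from its fingerprint recipe."""
    return b"".join(index[fingerprint] for fingerprint in recipe)


if __name__ == "__main__":
    index = {}
    data = b"A" * 4096 * 3 + b"B" * 4096  # three identical chunks, one distinct
    recipe, stored = deduplicate(data, index)
    print(len(recipe), stored)
    print(restore(recipe, index) == data)
```

In this sketch the per-chunk hashing and the index lookup run one after another on the CPU; these are the stages the abstract describes moving to, or caching on, the GPGPU.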
March 31, 2012 by hgpu