Heterogeneous Computing and Load Balancing Techniques for Monte Carlo Simulation in a Distributed Environment
Computer Science and Engineering, Ohio State University
Ohio State University, 2011
@phdthesis{deshpande2011heterogeneous,
title={Heterogeneous Computing and Load Balancing Techniques for {Monte Carlo} Simulation in a Distributed Environment},
author={Deshpande, I.},
year={2011},
school={The Ohio State University}
}
CPU-GPU clusters have emerged as a dominant HPC platform, with three of the four fastest supercomputers in the world falling in this category. The reasons for the popularity of these environments include their cost-effectiveness and energy efficiency. The need to exploit both the CPU and the GPU on each node of such platforms has created a renewed interest in heterogeneous computing [14]. Implementing such a heterogeneous system on a cluster is a challenge. At the same time, FREERIDE, a map-reduce-like framework, can be used to develop data-intensive applications efficiently on clusters and multi-core systems because of its simplicity and robustness. In this thesis, we develop a heterogeneous implementation of a Monte Carlo simulation application on a CPU-GPU cluster using FREERIDE, which is based on generalized reduction. We show through experiments that this approach supports scalable and efficient implementation of data-intensive applications on a heterogeneous cluster of many-core GPUs and CPUs. Our contributions are twofold: 1) we develop a heterogeneous version of a Monte Carlo application for a distributed environment using the FREERIDE APIs; 2) we present a new approach to load balancing between a CPU and a GPU on a node to better utilize the computing power of CPUs and/or GPUs. We evaluate our heterogeneous implementation on a cluster. We show an almost linear speedup on this cluster over execution with 1 CPU core, 1 GPU core, and a combination of 1 CPU core and 1 GPU core, respectively. Using the new load balancing technique, our application also achieves an improvement of 20% by using CPUs and GPUs simultaneously over the best performance achieved using only one type of resource in the cluster.
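The abstract does not spell out the FREERIDE API or the load balancing scheme, so the following is only a generic sketch of the two ideas it names: a generalized reduction (each worker accumulates into a private reduction object, and the objects are combined at the end) and dynamic work sharing between two devices on a node. Everything here is an assumption made for illustration: the workers named "cpu" and "gpu" are ordinary Python threads standing in for the real devices, the Monte Carlo task is a simple estimate of pi, and the function names `process_chunk`, `worker`, and `estimate_pi` are hypothetical.

```python
import queue
import random
import threading


def process_chunk(chunk_id, chunk_size):
    """Local reduction over one chunk: count random points inside the unit quarter circle."""
    # Seeding by chunk id makes the result independent of which worker takes the chunk.
    rng = random.Random(chunk_id)
    hits = 0
    for _ in range(chunk_size):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def worker(name, chunks, chunk_size, results):
    """Pull chunks until the shared queue is empty, accumulating into a private reduction object."""
    local_hits, local_chunks = 0, 0
    while True:
        try:
            chunk_id = chunks.get_nowait()
        except queue.Empty:
            break
        local_hits += process_chunk(chunk_id, chunk_size)
        local_chunks += 1
    # Each worker writes only its own key, so no lock is needed here.
    results[name] = (local_hits, local_chunks)


def estimate_pi(num_chunks=64, chunk_size=2000, worker_names=("cpu", "gpu")):
    """Distribute chunks dynamically, then combine the per-worker reduction objects."""
    chunks = queue.Queue()
    for chunk_id in range(num_chunks):
        chunks.put(chunk_id)

    results = {}
    threads = [
        threading.Thread(target=worker, args=(name, chunks, chunk_size, results))
        for name in worker_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Global combination step of the generalized reduction.
    total_hits = sum(hits for hits, _ in results.values())
    pi_estimate = 4.0 * total_hits / (num_chunks * chunk_size)
    return pi_estimate, results


if __name__ == "__main__":
    pi_estimate, results = estimate_pi()
    print("estimate:", pi_estimate)
    for name, (hits, n_chunks) in results.items():
        print(name, "processed", n_chunks, "chunks")
```

Because workers take chunks from a shared queue rather than receiving a fixed share up front, a faster device ends up processing more chunks without its relative speed being known in advance. This is one common way to balance load and is not necessarily the approach taken in the thesis.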
November 25, 2011 by hgpu