An Adaptative Multi-GPU based Branch-and-Bound. A Case Study: the Flow-Shop Scheduling Problem
Université Lille 1, LIFL/UMR CNRS 8022 – INRIA Lille Nord Europe, 59655 – Villeneuve d’Ascq cedex – France
14th IEEE International Conference on High Performance Computing and Communications, HPCC 2012
@article{2012arXiv1206.4973C,
author={{Chakroun}, I. and {Melab}, N.},
title={{An Adaptative Multi-GPU based Branch-and-Bound. A Case Study: the Flow-Shop Scheduling Problem}},
journal={ArXiv e-prints},
archivePrefix={arXiv},
eprint={1206.4973},
primaryClass={cs.DC},
keywords={Computer Science – Distributed, Parallel, and Cluster Computing, Computer Science – Robotics},
year={2012},
month={jun},
adsurl={http://adsabs.harvard.edu/abs/2012arXiv1206.4973C},
adsnote={Provided by the SAO/NASA Astrophysics Data System}
}
Solving Combinatorial Optimization Problems (COPs) exactly using a Branch-and-Bound (B&B) algorithm requires a huge amount of computational resources. Therefore, we recently investigated designing B&B algorithms on top of graphics processing units (GPUs) using a parallel bounding model. The proposed model parallelizes the evaluation of the lower bounds on pools of sub-problems. The results demonstrated that the size of the evaluated pool has a significant impact on the performance of B&B and that the best size depends strongly on the problem instance being solved. In this paper, we design an adaptive parallel B&B algorithm for solving permutation-based combinatorial optimization problems such as the Flow-shop Scheduling Problem (FSP) on GPU accelerators. To do so, we propose a dynamic heuristic for parameter auto-tuning at runtime. Another challenge of this work is to exploit larger degrees of parallelism by using the combined computational power of multiple GPU devices. The approach has been applied to the permutation flow-shop problem. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla S1070 Computing System equipped with two Tesla T10 GPUs. Compared to a CPU-based execution, speedups of up to 105x are achieved for large problem instances.
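To make the idea of bounding a pool of sub-problems concrete, here is a minimal Python sketch for the permutation flow-shop problem. It is illustrative only. The function names, the `proc[job][machine]` data layout, and the very simple last-machine lower bound are assumptions made for this example. They are not the bound or the code used in the paper, and the sequential loop over the pool only stands in for what a GPU would evaluate in parallel, one sub-problem per thread.

```python
from typing import List, Sequence, Tuple


def completion_times(prefix: Sequence[int],
                     proc: Sequence[Sequence[int]]) -> List[int]:
    """Completion time on each machine after scheduling the jobs in `prefix`.

    proc[j][m] is the processing time of job j on machine m (assumed layout).
    """
    n_machines = len(proc[0])
    done = [0] * n_machines  # done[m]: when machine m finishes its last job
    for job in prefix:
        prev = 0  # completion time of this job on the previous machine
        for m in range(n_machines):
            start = max(done[m], prev)
            done[m] = start + proc[job][m]
            prev = done[m]
    return done


def makespan(perm: Sequence[int], proc: Sequence[Sequence[int]]) -> int:
    """Makespan of a complete permutation."""
    return completion_times(perm, proc)[-1]


def simple_lower_bound(prefix: Sequence[int],
                       remaining: Sequence[int],
                       proc: Sequence[Sequence[int]]) -> int:
    """A deliberately simple bound (an assumption for this sketch): the last
    machine must still process every remaining job after finishing the prefix.
    """
    return completion_times(prefix, proc)[-1] + sum(proc[j][-1] for j in remaining)


def bound_pool(pool: Sequence[Tuple[Sequence[int], Sequence[int]]],
               proc: Sequence[Sequence[int]]) -> List[int]:
    """Evaluate the bound for every (prefix, remaining) sub-problem in a pool."""
    return [simple_lower_bound(prefix, rest, proc) for prefix, rest in pool]


if __name__ == "__main__":
    proc = [[3, 2], [1, 4]]  # 2 jobs x 2 machines, made-up data
    pool = [([0], [1]), ([1], [0])]
    print(bound_pool(pool, proc))
```

In a B&B search, any sub-problem whose bound is no better than the best makespan found so far can be pruned. The pool size (here, the length of `pool`) is the parameter the abstract says is tuned at runtime.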
June 25, 2012 by hgpu