high performance computing on graphics processing units: hgpu.org

A Translation Framework for Executing the Sequential Binary Code on CPU/GPU Based Architectures

Erzhou Zhu, Haibing Guan, Guoxing Dong, Yindong Yang, Hongbo Yang

Department of Computer Science and Engineering and Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiaotong University, Shanghai, China

Journal of Software, Volume 6, No. 12, 2011

DOI:10.4304/jsw.6.12.2331-2340

@article{zhu2011translation,
  title={A Translation Framework for Executing the Sequential Binary Code on CPU/GPU Based Architectures},
  author={Zhu, E. and Guan, H. and Dong, G. and Yang, Y. and Yang, H.},
  journal={Journal of Software},
  volume={6},
  number={12},
  pages={2331--2340},
  year={2011}
}

Dynamic binary translation (DBT), which executes binary code compiled for a source ISA on a different target platform, has been hampered by high overhead for many years. The GPU, as a many-core processor, offers tremendous computational power. Employing the GPU as a coprocessor to execute the hot spots of binary code in parallel holds great promise for substantially reducing the overhead of DBT. This paper presents a novel translation framework for constructing a virtual execution environment that aims to accelerate DBT on CPU/GPU based architectures. Given the parallelizable parts (hot spots) of the binary code and their related information, the framework converts the sequential code into PTX form and executes it on GPUs. Under the framework, the source code need not be rewritten, and binary compatibility issues between different GPUs are also resolved. Experimental results on several programs from the CUDA SDK Code Samples and the Parboil Benchmark Suite show that the framework can significantly improve performance, typically achieving a 10X speedup on average compared to native X86 platforms. Performance improves further as the input size grows.

Tags: Benchmarking, Code generation, Computer science, CUDA, nVidia, nVidia GeForce GTX 260, PTX

December 18, 2011 by hgpu
