PTX2Kernel: Converting PTX Code into Compilable Kernels

Qiankun Dong, Tao Li, Shuai Zhang, Xiaofan Jiao, Jiabing Leng

College of Computer and Control Engineering, Nankai University, Tianjin 300071, China

8th Workshop on Architectural and Microarchitectural Support for Binary Translation, 2015

GPUs are now widely used as high-performance general-purpose computing devices. A growing number of applications achieve large speedups with one or more GPUs, and the number of GPU programs is increasing rapidly. In some situations the high-level CUDA C source of a kernel is not available, but the low-level PTX code can be extracted from binary files. It would be very useful if that PTX code could be converted into editable and compilable kernels, so that programmers could modify or tune the converted kernels at the CUDA C level to best fit a specific application. To the best of our knowledge, however, no existing tool supports conversion from PTX code to kernels. In this paper, we propose the PTX2Kernel convertor, which converts embedded PTX code into editable and compilable CUDA C kernels without loss of efficiency. The converted kernels have legal CUDA C interfaces, and the PTX instructions are inlined in the CUDA kernel bodies. With the PTX2Kernel convertor, it is much easier for programmers to produce optimized kernel versions when only embedded PTX code is available. Two real-world cases, the 4-mat GEMM and band matrix multiplication, are used to demonstrate the flexibility that the PTX2Kernel convertor provides for code optimization.
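
To make concrete what "a legal CUDA C interface with PTX instructions inlined in the kernel body" can look like, here is a minimal hand-written sketch. It is not output of PTX2Kernel, and the abstract does not describe the tool's actual output format. The names `vec_add_ptx` and `run_vec_add` are hypothetical, and the sketch assumes both input vectors have the same length. CUDA error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: an ordinary CUDA C signature whose arithmetic
// is expressed as an inline PTX instruction instead of C code.
__global__ void vec_add_ptx(const int* a, const int* b, int* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int sum;
        // PTX add.s32: 32-bit signed integer add. "r" binds a 32-bit register.
        asm volatile("add.s32 %0, %1, %2;"
                     : "=r"(sum)
                     : "r"(a[i]), "r"(b[i]));
        c[i] = sum;
    }
}

// Hypothetical host helper: copies inputs to the GPU, launches the kernel,
// and returns the result. Assumes a.size() == b.size().
std::vector<int> run_vec_add(const std::vector<int>& a, const std::vector<int>& b)
{
    const int n = static_cast<int>(a.size());
    std::vector<int> c(n, 0);
    if (n == 0) return c;  // a zero-block launch is invalid, so return early

    const size_t bytes = static_cast<size_t>(n) * sizeof(int);
    int *da = nullptr, *db = nullptr, *dc = nullptr;
    cudaMalloc(&da, bytes);
    cudaMalloc(&db, bytes);
    cudaMalloc(&dc, bytes);
    cudaMemcpy(da, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, b.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;  // round up
    vec_add_ptx<<<blocks, threads>>>(da, db, dc, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(c.data(), dc, bytes, cudaMemcpyDeviceToHost);
    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    return c;
}

int main()
{
    const std::vector<int> a = {1, 2, 3};
    const std::vector<int> b = {10, 20, 30};
    const std::vector<int> c = run_vec_add(a, b);
    for (int v : c) std::printf("%d ", v);
    std::printf("\n");
    return 0;
}
```

Because the kernel keeps a normal `__global__` signature, it can be launched, edited, and recompiled with `nvcc` like any other CUDA C kernel, which is the property the abstract emphasizes.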

Tags: Computer science, CUDA, Matrix multiplication, nVidia, nVidia GeForce GTX 480, PTX, Tesla K40

March 22, 2015 by hgpu
