PTX2Kernel: Converting PTX Code into Compilable Kernels

Qiankun Dong, Tao Li, Shuai Zhang, Xiaofan Jiao, Jiabing Leng
College of Computer and Control Engineering, Nankai University, Tianjin 300071, China
8th Workshop on Architectural and Microarchitectural Support for Binary Translation, 2015








GPUs are now widely used as high-performance general-purpose computing devices. A growing number of applications achieve large speedups with one or more GPUs, and the number of GPU programs is increasing rapidly. In some situations, the high-level CUDA C source of a kernel is not available, but low-level PTX code can be extracted from the binary files. It would be very useful if that PTX code could be converted into editable and compilable kernels, so that programmers could modify or tune the converted kernels at the CUDA C level to best fit a specific application. To the best of our knowledge, however, no existing tool supports conversion from PTX code to kernels. In this paper, we propose the PTX2Kernel convertor, which converts embedded PTX code into editable and compilable CUDA C kernels without loss of efficiency. The converted kernels have legal CUDA C interfaces, and the PTX instructions are inlined in the CUDA kernel bodies. With the PTX2Kernel convertor, it is much easier for programmers to produce optimized kernel versions when only embedded PTX code is available. Two real-world cases, the 4-mat GEMM and band matrix multiplication, are used to demonstrate the flexibility that the PTX2Kernel convertor provides for code optimization.
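To make the idea of a kernel with a legal CUDA C interface and an inlined PTX body concrete, the following is a minimal hand-written sketch. It is not output of PTX2Kernel. It assumes the inlining is done with CUDA's standard inline `asm` statement, and the kernel name `vec_add`, its parameters, and the host driver are hypothetical choices made only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical example kernel (not produced by PTX2Kernel):
// an ordinary CUDA C signature whose arithmetic is a single inlined PTX instruction.
__global__ void vec_add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float r;
        // add.f32 is a PTX instruction; %0 is the output operand,
        // %1 and %2 are inputs. The "f" constraint binds a 32-bit float register.
        asm volatile("add.f32 %0, %1, %2;" : "=f"(r) : "f"(a[i]), "f"(b[i]));
        c[i] = r;
    }
}

int main()
{
    const int n = 256;
    float *a, *b, *c;

    // Managed memory keeps the host-side driver short.
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));

    for (int i = 0; i < n; ++i) {
        a[i] = static_cast<float>(i);
        b[i] = 2.0f * static_cast<float>(i);
    }

    vec_add<<<(n + 127) / 128, 128>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[10] = %f\n", c[10]);

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

Because the kernel keeps a normal CUDA C signature, it can be launched, edited, and recompiled with `nvcc` like any other kernel, while the body stays at the PTX level. A real converted kernel would carry many more inlined instructions than this one-line example.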
