Neural Network Implementation Using CUDA and OpenMP

Honghoon Jang, Anjin Park, Keechul Jung
Department of Digital Media, Soongsil University
In DICTA ’08: Proceedings of the 2008 Digital Image Computing: Techniques and Applications (08 December 2008), pp. 155-161.










Many image processing and pattern recognition algorithms have recently been implemented on the GPU (graphics processing unit) for faster computation. However, GPU implementations encounter two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job that needs much cooperation between CPU and GPU, which is usual in image processing and pattern recognition but not in graphics, the CPU should generate as much raw feature data for GPU processing as possible to utilize GPU performance effectively. This paper proposes a faster and more efficient implementation of neural networks on both GPU and multi-core CPU. To solve the first problem, we use CUDA (Compute Unified Device Architecture), which is easy to program thanks to its simple C-like style, instead of graphics shading languages. Moreover, OpenMP (Open Multi-Processing) is used to process multiple data concurrently with a single instruction on the multi-core CPU, which results in effective utilization of GPU memory. In the experiments, we implemented a neural network-based text detection system using the proposed architecture; it ran about 15 times faster than a CPU implementation and about 4 times faster than a GPU-only implementation without OpenMP.
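The division of labor described in the abstract can be illustrated with a minimal C sketch. It is not the authors' code: the window size `WIN`, the function names `extract_features` and `layer_forward`, and the row-major data layout are all assumptions made for illustration. The first function uses an OpenMP parallel loop to gather one feature vector per image window into a batch matrix on the CPU. The second computes the weighted sums of one layer for the whole batch as a single matrix product; here it is a plain CPU stand-in for the step that a CUDA kernel would perform on the GPU, and the activation function is omitted.

```c
#include <stddef.h>

/* Hypothetical window side length; not a value taken from the paper. */
enum { WIN = 3 };

/* CPU step (OpenMP): copy every WIN x WIN window of a row-major image
 * into one row of `batch`, so that the batch holds
 * (height - WIN + 1) * (width - WIN + 1) feature vectors of WIN * WIN
 * values each. Each loop iteration writes a distinct row, so the
 * iterations are independent and can run on separate cores. */
void extract_features(const float *img, int width, int height, float *batch)
{
    int cols = width - WIN + 1;
    int rows = height - WIN + 1;

    #pragma omp parallel for
    for (int w = 0; w < rows * cols; ++w) {
        int y = w / cols;
        int x = w % cols;
        float *row = batch + (size_t)w * WIN * WIN;
        for (int dy = 0; dy < WIN; ++dy)
            for (int dx = 0; dx < WIN; ++dx)
                row[dy * WIN + dx] = img[(y + dy) * width + (x + dx)];
    }
}

/* Stand-in for the GPU step: weighted sums of one layer for all n
 * samples at once, i.e. out = batch * weights + bias.
 * batch is n x in_dim, weights is in_dim x out_dim, out is n x out_dim,
 * all row-major. In a CUDA version this product would be a kernel
 * launched on the batch after it is copied to device memory. */
void layer_forward(const float *batch, int n, int in_dim,
                   const float *weights, const float *bias,
                   int out_dim, float *out)
{
    for (int s = 0; s < n; ++s) {
        for (int j = 0; j < out_dim; ++j) {
            float acc = bias[j];
            for (int i = 0; i < in_dim; ++i)
                acc += batch[(size_t)s * in_dim + i] * weights[(size_t)i * out_dim + j];
            out[(size_t)s * out_dim + j] = acc;
        }
    }
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result. Batching many windows before the layer computation is the point of the sketch: the larger the batch the CPU prepares, the more work each transfer to the GPU carries.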
