Darknet on OpenCL: a multi-platform tool for object detection and classification

Piotr Sowa, Jacek Izydorczyk

Aptiv Technical Center, Kraków, Poland

Preprints 2020070506, 2020

DOI:10.20944/preprints202007.0506.v1

This article surveys the challenges and problems encountered on the way from state-of-the-art CUDA-accelerated neural network code to multi-GPU code. To that end, the authors describe porting the existing, fully featured, CUDA-accelerated Darknet engine available on GitHub to OpenCL. The article presents the lessons learned and the techniques that were put in place to make this port happen. Few other implementations on GitHub leverage the OpenCL standard, although a few have also attempted to port Darknet. Darknet is a well-known convolutional neural network (CNN) framework. The authors investigated all aspects of the port and produced a fully featured Darknet engine on OpenCL. The effort covered not only classification with the YOLO1, YOLO2, and YOLO3 CNN models but also other aspects, such as training neural networks and benchmarking to find weak points in the implementation. The GPU code substantially reduces Darknet's computing time relative to the standard CPU version by using underused hardware in existing systems. An OpenCL-based system is practically hardware-independent. The authors compare computation and training performance against the existing CUDA-based Darknet engine on various computers, including single-board computers, and across different CNN use cases. They found that the OpenCL version can match the CUDA version in compute speed but is slower at transferring data between RAM (CPU memory) and VRAM (GPU memory); this difference depends only on the quality of the OpenCL implementation. Moreover, by loosening hardware requirements, the OpenCL Darknet can boost applications of deep neural networks (DNNs), especially in energy-sensitive applications of artificial intelligence (AI) and machine learning (ML).

Tags: ARM, Artificial intelligence, Benchmarking, Computer science, CUDA, Deep learning, Machine learning, Neural networks, nVidia, nVidia Titan RTX, OpenCL, Package

July 26, 2020 by hgpu
