An Accelerator based on the rho-VEX Processor: an Exploration using OpenCL

hgpu.org » Applications » Computer science » An Accelerator based on the rho-VEX Processor: an Exploration using OpenCL

An Accelerator based on the rho-VEX Processor: an Exploration using OpenCL

Hugo van der Wijst

Department of Electrical Engineering, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology

Delft University of Technology, 2015

BibTeX

Download (PDF)

View

Source

1982

views

In recent years the use of co-processors to accelerate specific tasks is becoming more common. To simplify the use of these accelerators in software, the OpenCL framework has been developed. This framework provides programs a cross-platform interface for using accelerators. The rho-VEX processor is a run-time reconfigurable VLIW processor. It allows run-time switching of configurations, executing a large amount of contexts with low issue-width or a low amount of contexts with high issue-width. This thesis investigates if the rho-VEX processor can be competitively used as an accelerator using OpenCL. To answer this question, a design and implementation is made of such an accelerator. By measuring the speed of various components of this implementation, a model is created for the run-time of a kernel. Using this model, a projection is made of the execution time on an accelerator produced as an ASIC. For the implementation of the accelerator, the rho-VEX processor is instantiated on an FPGA and connected to the host using the PCI Express bus. A Linux kernel driver has been developed to provide interfaces for user space applications to communicate with the accelerator. These interfaces are used to implement a new device-layer for the pocl OpenCL framework. By modeling the execution time, three major contributing factors to the execution time were found: the data transfer throughput, the kernel compile time, and the kernel execution time. It is projected that an accelerator based on the rho-VEX processor, using similar production technologies and without architectural changes, can achieve 1.2 to 0.11 times the performance of a modern GPU.

Tags: Computer science, FPGA, hardware acceleration, OpenCL, Thesis

December 4, 2015 by hgpu

No votes yet.

Please wait...

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

Engineering Supercomputing Platforms for Biomolecular Applications

high performance computing on graphics processing units: hgpu.org