SIMD Floating Point Extension for Ray Tracing

Yunus Okmen
Department of Electrical Engineering, Faculty of Electrical Engineering, Mathematics and Computer Science, Mekelweg 4, 2628 CD Delft, The Netherlands
Delft University of Technology, 2011








In the last decade, graphics capabilities have become very important in the mobile market. As a result, low-power embedded solutions for mobile devices have been developed to run computationally intensive graphics applications, which make extensive use of floating point calculations. The work proposed in this thesis targets the extension of the Silicon Hive processors' capabilities for graphics applications. The Silicon Hive core generation flow, which allows a very high degree of parallelism to be introduced, can be used efficiently to generate a processor for graphics. To achieve that, this thesis presents a hybrid VLIW/SIMD floating point processor derived from the base Silicon Hive VLIW architecture, Pearl Ray. The hardware implementation of the floating point functional units is realized using the Synopsys DesignWare building blocks, which are designed to allow efficient use of the register retiming option in the Design Compiler flow in order to introduce pipeline stages and improve timing. The proposed architecture can process 8-way vectors consisting of 32-bit vector elements. To evaluate the efficiency of the proposed architecture, a Ray Tracing algorithm has been mapped onto the developed processor. We show that the Ray Tracing algorithm efficiently exploits the full power of the floating point vector instructions, as well as the parallelism provided by both the VLIW and the SIMD (vector) nature of our processor. The results show that a close-to-linear speed-up can be achieved for the Ray Tracing algorithm using the proposed architecture. Finally, the performance of the proposed extended VLIW/SIMD floating point processor has been compared with a very high-end graphics processing unit and a general purpose processor, in terms of number of cycles, total execution time and power consumption on the same Ray Tracing algorithm.
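To make the 8-way vector idea concrete, the following is a minimal sketch of how eight rays could be tested against one sphere in lock-step, one ray per vector lane. It is an illustration only, not the thesis implementation: the function name `intersect_sphere8`, the structure-of-arrays layout, and the ray-sphere test itself are assumptions, and Python floats are 64-bit whereas the processor described above uses 32-bit elements. Ray directions are assumed to be non-zero, and only the nearer root is considered.

```python
import math

LANES = 8  # the abstract states 8-way vectors of 32-bit elements


def intersect_sphere8(ox, oy, oz, dx, dy, dz, cx, cy, cz, radius):
    """Intersect 8 rays (structure-of-arrays layout) with one sphere.

    Hypothetical helper for illustration. Returns (hit_mask, t): a
    per-lane boolean mask and the nearest hit distance per lane
    (math.inf where the lane misses).
    """
    assert all(len(v) == LANES for v in (ox, oy, oz, dx, dy, dz))

    # Per-lane vector from the sphere center to the ray origin.
    lx = [o - cx for o in ox]
    ly = [o - cy for o in oy]
    lz = [o - cz for o in oz]

    # Quadratic coefficients of |o + t*d - c|^2 = r^2, one set per lane.
    a = [x * x + y * y + z * z for x, y, z in zip(dx, dy, dz)]
    b = [2.0 * (p * x + q * y + r * z)
         for p, q, r, x, y, z in zip(lx, ly, lz, dx, dy, dz)]
    c = [p * p + q * q + r * r - radius * radius
         for p, q, r in zip(lx, ly, lz)]
    disc = [bi * bi - 4.0 * ai * ci for ai, bi, ci in zip(a, b, c)]

    # Clamp instead of branching so every lane runs the same operations,
    # which is how a SIMD unit would typically handle divergent rays.
    root = [math.sqrt(max(d, 0.0)) for d in disc]
    t_near = [(-bi - ri) / (2.0 * ai) for ai, bi, ri in zip(a, b, root)]

    # The mask records which lanes produced a valid hit in front of the ray.
    hit = [d >= 0.0 and t > 0.0 for d, t in zip(disc, t_near)]
    t = [tn if h else math.inf for tn, h in zip(t_near, hit)]
    return hit, t
```

Each list comprehension stands in for one vector instruction applied to all eight lanes; the mask at the end replaces the per-ray branch that a scalar implementation would use.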
The results show that the proposed extended Silicon Hive processor can compete with both the GPU and the CPU in terms of execution time. Furthermore, it outperforms the 8-core machine after the execution time is normalized with respect to the corresponding clock frequencies. The overall performance of the GPU is slightly better than that of the proposed processor. However, the advantage of our extended embedded processor becomes clear when area and power consumption are taken into account. Whereas the GPU and the CPU consume around 140 watts and 85 watts, respectively, our floating point VLIW/SIMD processor consumes only 0.2-0.3 watts.
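The two comparisons mentioned above, normalizing execution time by clock frequency and weighing power consumption, can be sketched as simple arithmetic. The helper names below are hypothetical, and the abstract does not give execution times or clock frequencies, so only the approximate power figures it states are used as constants.

```python
# Approximate power figures quoted in the abstract (watts).
GPU_POWER_W = 140.0
CPU_POWER_W = 85.0
PROPOSED_POWER_W = (0.2, 0.3)  # stated range for the VLIW/SIMD processor


def cycle_count(exec_time_s, clock_hz):
    """Normalize execution time by clock frequency: cycles = time * frequency."""
    return exec_time_s * clock_hz


def energy_joules(power_w, exec_time_s):
    """Energy for one run: power * time."""
    return power_w * exec_time_s


def power_ratio_range(other_w, proposed_range_w):
    """Return (min, max) of how many times more power the other device draws."""
    low_w, high_w = proposed_range_w
    return other_w / high_w, other_w / low_w


gpu_ratio = power_ratio_range(GPU_POWER_W, PROPOSED_POWER_W)
cpu_ratio = power_ratio_range(CPU_POWER_W, PROPOSED_POWER_W)
```

With the stated figures, the GPU draws roughly 470 to 700 times the power of the proposed processor and the CPU roughly 280 to 425 times, which is why a slightly slower run can still use far less energy.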
