Performance analysis of SSE instructions in multi-core CPUs and GPU computing on FDTD scheme for solid and fluid vibration problems

Jorge Frances Monllor, Sergio Bleda Perez, Andres Marquez Ruiz, Cristian Neipp Lopez, Sergi Gallego Rico, Beatriz Otero Calvino, Augusto Belendez Vazquez
Universidad de Alicante. Departamento de Fisica, Ingenieria de Sistemas y Teoria de la Senal
13th International Conference on Mathematical Methods in Science and Engineering, 2013








In this work, a unified treatment of solid and fluid vibration problems is developed by means of the Finite-Difference Time-Domain (FDTD) method. The proposed scheme introduces a scaling factor in the velocity fields that improves the performance of the method and the vibration analysis in heterogeneous media. To accurately reproduce the interaction of fluids and solids in FDTD, both the time and spatial steps must be reduced compared with the setup used in acoustic FDTD problems. This implies larger grids and hence more time and memory resources. To reduce simulation time, the FDTD code has been adapted to exploit the resources available in modern parallel architectures. For CPUs, the implicit usage of the Streaming SIMD (Single Instruction, Multiple Data) Extensions (SSE) in multi-core CPUs has been considered. In addition, the computation has been distributed across the available cores by means of OpenMP directives. Graphics Processing Units (GPUs) have also been considered, and the improvement achieved with this parallel architecture has been compared with the highly tuned CPU scheme by means of the relative speedup. The parallel versions were up to 7 and 30 times faster than the best sequential version for CPU and GPU, respectively. Results obtained with both parallel approaches demonstrate that parallel programming techniques are mandatory in solid-vibration problems with FDTD.
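To make the CPU strategy described in the abstract more concrete, the following is a minimal sketch of one time step of a generic 1D acoustic FDTD leapfrog update. It is not the authors' code and does not model their velocity scaling factor or solid media. The function name `fdtd_step`, the coefficients `cv` and `cp`, and the fixed pressure boundaries are assumptions made for illustration. The loops are written with unit stride, single precision, and `restrict` pointers so that a compiler may vectorize them with SSE on its own, and the OpenMP directives distribute the iterations across cores when OpenMP is enabled at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* One leapfrog time step of a generic 1D acoustic FDTD scheme on a
 * staggered grid (illustrative sketch, not the scheme from the paper).
 *   p : n pressure samples; p[0] and p[n-1] are held fixed (assumed boundary)
 *   v : n-1 velocity samples; v[i] sits between p[i] and p[i+1]
 *   cv: assumed coefficient dt / (rho * dx)
 *   cp: assumed coefficient kappa * dt / dx
 */
void fdtd_step(float *restrict p, float *restrict v, size_t n,
               float cv, float cp)
{
    assert(n >= 2);
    long m = (long)n;

    /* Velocity update: each iteration is independent, so the loop can be
     * split across threads and auto-vectorized by the compiler. */
    #pragma omp parallel for
    for (long i = 0; i < m - 1; ++i)
        v[i] -= cv * (p[i + 1] - p[i]);

    /* Pressure update on interior points, using the new velocities. */
    #pragma omp parallel for
    for (long i = 1; i < m - 1; ++i)
        p[i] -= cp * (v[i] - v[i - 1]);
}
```

With GCC, for example, building with `-O3 -fopenmp` enables both the OpenMP directives and automatic vectorization. Without `-fopenmp` the pragmas are ignored and the function runs sequentially, which gives a convenient baseline for measuring relative speedup.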
