Performance analysis of SSE instructions in multi-core CPUs and GPU computing on FDTD scheme for solid and fluid vibration problems

Jorge Francés Monllor, Sergio Bleda Pérez, Andrés Márquez Ruiz, Cristian Neipp López, Sergi Gallego Rico, Beatriz Otero Calviño, Augusto Beléndez Vázquez
Universidad de Alicante. Departamento de Física, Ingeniería de Sistemas y Teoría de la Señal
13th International Conference on Mathematical Methods in Science and Engineering, 2013
@article{frances2013performance,
   title={Performance analysis of SSE instructions in multi-core CPUs and GPU computing on FDTD scheme for solid and fluid vibration problems},
   author={Franc{\'e}s Monllor, Jorge and Bleda P{\'e}rez, Sergio and M{\'a}rquez Ruiz, Andr{\'e}s and Neipp L{\'o}pez, Cristian and Gallego Rico, Sergi and Otero Calvi{\~n}o, Beatriz and Bel{\'e}ndez V{\'a}zquez, Augusto and others},
   year={2013},
   publisher={CMMSE}
}

In this work, a unified treatment of solid and fluid vibration problems is developed by means of the Finite-Difference Time-Domain (FDTD) method. The proposed scheme introduces a scaling factor in the velocity fields that improves the performance of the method and the vibration analysis in heterogeneous media. To accurately reproduce the interaction of fluids and solids in FDTD, both the time and spatial steps must be reduced compared with the setup used in acoustic FDTD problems. This implies larger grids and hence more time and memory resources. To reduce the simulation time, the FDTD code has been adapted to exploit the resources available in modern parallel architectures. For CPUs, the implicit usage of the streaming SIMD (Single Instruction Multiple Data) extensions (SSE) in multi-core CPUs has been considered. In addition, the computation has been distributed across the available cores by means of OpenMP directives. Graphics Processing Units (GPUs) have also been considered, and the improvement achieved with this parallel architecture has been compared with the highly tuned CPU scheme by means of the relative speedup. The parallel versions were up to 7 and 30 times faster than the best sequential version for CPU and GPU, respectively. Results obtained with both parallel approaches demonstrate that parallel programming techniques are mandatory in solid-vibration problems with FDTD.
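To make the CPU-side parallelisation concrete, the following is a minimal sketch of one time step of a plain 1D acoustic FDTD scheme on a staggered grid, with the loops annotated for OpenMP threading and SIMD vectorisation. It is an illustration only and not the authors' code: the function name `fdtd1d_step`, the 1D layout, the rigid-wall boundaries, and the single-precision storage are all assumptions, and the velocity scaling factor described in the abstract is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* One leapfrog time step of a 1D acoustic FDTD scheme on a staggered grid.
 * Hypothetical helper for illustration; not taken from the paper.
 * p holds n pressure cells; v holds n + 1 velocity nodes on the cell faces.
 * v[0] and v[n] are never updated, so if they start at zero they behave
 * as rigid walls. */
void fdtd1d_step(float *restrict p, float *restrict v, size_t n,
                 float rho, float c, float dt, float dx)
{
    assert(n >= 2);

    const float cv = dt / (rho * dx);        /* velocity update coefficient */
    const float cp = rho * c * c * dt / dx;  /* pressure update coefficient */

    /* Velocity update: each interior node depends only on the two
     * neighbouring pressure cells, so iterations are independent and can
     * be split across threads and SIMD lanes. */
    #pragma omp parallel for simd
    for (size_t i = 1; i < n; ++i)
        v[i] -= cv * (p[i] - p[i - 1]);

    /* Pressure update: uses the velocities just computed above. */
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; ++i)
        p[i] -= cp * (v[i + 1] - v[i]);
}
```

Built without OpenMP support, the pragmas are ignored and the loops run sequentially, which gives a simple baseline for measuring a relative speedup. The scheme is stable only when the Courant number `c * dt / dx` does not exceed 1 in 1D.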
