{"id":7411,"date":"2012-04-09T13:25:51","date_gmt":"2012-04-09T10:25:51","guid":{"rendered":"http:\/\/hgpu.org\/?p=7411"},"modified":"2012-04-09T13:25:51","modified_gmt":"2012-04-09T10:25:51","slug":"new-basic-linear-algebra-methods-for-simulation-on-gpus","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=7411","title":{"rendered":"New Basic Linear Algebra Methods for Simulation on GPUs"},"content":{"rendered":"<p>We have used Graphics Processing Units (GPUs) to accelerate the solution of the types of equations typically encountered in dynamic system simulators. Compared to commercial matrix solvers that run on a CPU, we realized speedups ranging from 5x (for system size ~700) to 460x (for system size ~5800). While calculation time for the commercial matrix solver increased with matrix size as ~O(N^2.3), our new GPU-based Preconditioned Generalized Minimal Residual (PGMRES) technique yielded scaling of ~O(N^1.2). A significant component of this performance was achieved through the development of new Basic Linear Algebra routines for the NVIDIA Tesla GPU that directly address characteristics typical of matrices that describe the time-domain response of naturally coupled dynamic systems.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We have used Graphics Processing Units (GPUs) to accelerate the solution of the types of equations typically encountered in dynamic system simulators. Compared to commercial matrix solvers that run on a CPU, we realized speedups ranging from 5x (for system size ~700) to 460x (for system size ~5800). 
While calculation time for the commercial matrix [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,89,3],"tags":[1782,14,37,20,429],"class_list":["post-7411","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-nvidia-cuda","category-paper","tag-computer-science","tag-cuda","tag-linear-algebra","tag-nvidia","tag-tesla-t10"],"views":1975,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/7411","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7411"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/7411\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7411"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7411"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7411"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}
]}}