{"id":2997,"date":"2011-02-27T09:15:15","date_gmt":"2011-02-27T09:15:15","guid":{"rendered":"http:\/\/hgpu.org\/?p=2997"},"modified":"2011-02-27T09:15:15","modified_gmt":"2011-02-27T09:15:15","slug":"size-matters-spacetime-tradeoffs-to-improve-gpgpu-applications-performance","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=2997","title":{"rendered":"Size Matters: Space\/Time Tradeoffs to Improve GPGPU Applications Performance"},"content":{"rendered":"<p>GPUs offer drastically different performance characteristics compared to traditional multicore architectures. To explore the tradeoffs exposed by this difference, we refactor MUMmer, a widely used, highly engineered bioinformatics application that has both CPU- and GPU-based implementations. We synthesize our experience as three high-level guidelines for designing efficient GPU-based applications. First, minimizing communication overheads is as important as optimizing the computation. Second, trading off higher computational complexity for a more compact in-memory representation is a valuable technique to increase overall performance (by enabling higher parallelism levels and reducing transfer overheads). Finally, ensuring that the chosen solution entails low pre- and post-processing overheads is essential to maximize the overall performance gains. Based on these insights, MUMmerGPU++, our GPU-based design of the MUMmer sequence alignment tool, achieves up to a 4x speedup on realistic workloads compared to a previous, highly optimized GPU port.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>GPUs offer drastically different performance characteristics compared to traditional multicore architectures. To explore the tradeoffs exposed by this difference, we refactor MUMmer, a widely used, highly engineered bioinformatics application that has both CPU- and GPU-based implementations. We synthesize our experience as three high-level guidelines for designing efficient GPU-based applications.\nFirst, minimizing communication overheads is as important [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[10,89,3],"tags":[123,1781,231,14,20,183,176,209,268,269],"class_list":["post-2997","post","type-post","status-publish","format-standard","hentry","category-biology","category-nvidia-cuda","category-paper","tag-bioinformatics","tag-biology","tag-computational-biology","tag-cuda","tag-nvidia","tag-nvidia-geforce-8800-gtx","tag-package","tag-sequence-alignment","tag-short-read-mapping","tag-suffix-trees"],"views":2297,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/2997","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2997"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/2997\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2997"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2997"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2997"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}