{"id":2904,"date":"2011-02-19T14:59:44","date_gmt":"2011-02-19T14:59:44","guid":{"rendered":"http:\/\/hgpu.org\/?p=2904"},"modified":"2011-02-19T14:59:44","modified_gmt":"2011-02-19T14:59:44","slug":"high-performance-relevance-vector-machine-on-gpus","status":"publish","type":"post","link":"https:\/\/hgpu.org\/?p=2904","title":{"rendered":"High Performance Relevance Vector Machine on GPUs"},"content":{"rendered":"<p>The Relevance Vector Machine (RVM) algorithm has been widely utilized in many applications, such as machine learning, image pattern recognition, and compressed sensing. However, the RVM algorithm is computationally expensive. We seek to accelerate the RVM algorithm computation for time-sensitive applications by utilizing massively parallel accelerators such as GPUs. In this paper, the computation procedure of the RVM algorithm is fully analyzed. Recursive Cholesky decomposition, the key step in the RVM algorithm, is implemented on GPUs. The GPU performance is compared with a CPU using LAPACK and a hybrid system using the MAGMA library. Results show that our GPU implementation in both single and double precision is approximately 4 times faster than the CPU using LAPACK and faster than the hybrid MAGMA code when the matrix size is small.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Relevance Vector Machine (RVM) algorithm has been widely utilized in many applications, such as machine learning, image pattern recognition, and compressed sensing. However, the RVM algorithm is computationally expensive. We seek to accelerate the RVM algorithm computation for time-sensitive applications by utilizing massively parallel accelerators such as GPUs. 
In this paper, the computation [&hellip;]<\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[11,90,3],"tags":[1782,37,20,379,1793],"class_list":["post-2904","post","type-post","status-publish","format-standard","hentry","category-computer-science","category-opencl","category-paper","tag-computer-science","tag-linear-algebra","tag-nvidia","tag-nvidia-geforce-gtx-480","tag-opencl"],"views":2112,"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/2904","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2904"}],"version-history":[{"count":0,"href":"https:\/\/hgpu.org\/index.php?rest_route=\/wp\/v2\/posts\/2904\/revisions"}],"wp:attachment":[{"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2904"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2904"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/hgpu.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2904"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}