Fast Linear Algebra on GPU
title={Fast Linear Algebra on GPU},
author={Polok, L. and Smr{\v{z}}, P.},
booktitle={IEEE conference proceedings},
pages={6},
organization={IEEE Computer Society},
year={2012}
}
Tags: Computer science, CUBLAS, Linear Algebra, nVidia, nVidia GeForce GTX 260, nVidia GeForce GTX 590, OpenCL
Similar posts:
- Auto-Tuning of Level 1 and Level 2 BLAS for GPUs
- Auto-tuning Dense Vector and Matrix-Vector Operations for Fermi GPUs
- Optimization Solutions for Improving the Performance of the Parallel Reduction Algorithm Using Graphics Processing Units
- High-Performance Matrix-Vector Multiplication on the GPU
- Efficient Synchronization Primitives for GPUs




