Ang Li, Radu Serban, Dan Negrut
We discuss an approach for solving sparse or dense banded linear systems ${\bf A} {\bf x} = {\bf b}$ on a Graphics Processing Unit (GPU) card. The matrix ${\bf A} \in {\mathbb{R}}^{N \times N}$ is possibly nonsymmetric and moderately large; i.e., $10000 \leq N \leq 500000$. The ${\it split and parallelize}$ (${\tt SaP}$) approach seeks […]
Andra-Ecaterina Hugo
To face the ever demanding requirements in terms of accuracy and speed of scientific simulations, the High Performance community is constantly increasing the demands in terms of parallelism, thus adding tremendous value to parallel libraries strongly optimized for highly complex architectures. Enabling HPC applications to perform efficiently when invoking multiple parallel libraries simultaneously is a great […]
Xavier Lacoste, Mathieu Faverge, Pierre Ramet, Samuel Thibault, George Bosilca
The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of the computing resources. The pressure to maintain reasonable levels of performance and portability, forces the application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic […]
Kyungjoo Kim
We present a sparse direct solver using multilevel task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which is characterized by exploiting dense subproblems (fronts) related in an assembly tree. Critical to high performance of the solver […]
Zhijun Qin, Yunhe Hou
Graphics processing units (GPU) have been investigated to release the computational capability in various scientific applications. Recent research shows that prudential consideration needs to be given to take the advantages of GPUs while avoiding the deficiency. In this paper, the impact of GPU acceleration to implicit integrators and explicit integrators in transient stability is investigated. […]
Xavier Lacoste, Pierre Ramet, Mathieu Faverge, Yamazaki Ichitaro, Jack Dongarra
The current trend in the high performance computing shows a dramatic increase in the number of cores on the shared memory compute nodes. Algorithms, especially those related to linear algebra, need to be adapted to these new computer architectures in order to be efficient. PASTIX is a sparse parallel direct solver, that incorporates a dynamic […]
Geraud P. Krawezik, Gene Poole
As hardware accelerators and especially GPUs become more and more popular to accelerate the compute intensive parts of an algorithm, standard high performance computing packages are starting to benefit from this trend. We present the first GPU acceleration of the ANSYS direct sparse solver. We explain how such a multifrontal solver may be accelerated using […]
O. Schenk, M. Christen, H. Burkhart
We report on our experience with integrating and using graphics processing units (GPUs) as fast parallel floating-point co-processors to accelerate two fundamental computational scientific kernels on the GPU: sparse direct factorization and nonlinear interior-point optimization. Since a full re-implementation of these complex kernels is typically not feasible, we identify the matrix-matrix multiplication as a first […]

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
