Albert Claret Exojo
Intel co-founder Gordon E. Moore observed in 1965 that transistor density, the number of transistors that could be placed in an integrated circuit per square inch, increased exponentially, doubling roughly every two years. This would be later known as Moore’s Law, correctly predicting the trend that governed computing hardware manufacturing for the late 20th century. […]
Gregory Diamos
The emergence of heterogeneous and many-core architectures presents a unique opportunity to deliver order of magnitude performance increases to high performance applications by matching certain classes of algorithms to specifically tailored architectures. However, their ubiquitous adoption has been limited by a lack of programming models and management frameworks designed to reduce the high degree of […]
C. Ong, M. Weldon, D. Cyca, M. Okoniewski
In this paper, a scalable graphics processing unit (GPU) cluster solution for the acceleration of FDTD for large-scale simulations is proposed. The hardware and software implementations are described. To illustrate the speed performance of the cluster, the simulation results of a cubic resonator with PEC boundaries are presented. A realistic large-scale simulation performed using SEMCAD X […]
Gregory Diamos, Sudhakar Yalamanchili
The lag of parallel programming models and languages behind the advance of heterogeneous many-core processors has left a gap between the computational capability of modern systems and the ability of applications to exploit them. Emerging programming models, such as CUDA and OpenCL, force developers to explicitly partition applications into components (kernels) and assign them to […]
Dana A. Jacobsen, Julien C. Thibault, Inanc Senocak
Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications tremendously. While multiGPU workstations with several TeraFLOPS of peak computing power are available to accelerate computational problems, larger problems require even more resources. Conventional clusters of central processing units (CPU) are now being augmented […]
Enrique S. Quintana-Orti, Francisco D. Igual, Enrique S. Quintana-Orti, Robert A. van de Geijn
In a previous PPoPP paper we showed how the FLAME methodology, combined with the SuperMatrix runtime system, yields a simple yet powerful solution for programming dense linear algebra operations on multicore platforms. In this paper we provide further evidence that this approach solves the programmability problem for this domain by targeting a more complex architecture, […]
Byunghyun Jang, David R. Kaeli, Synho Do, Homer Pien
Although iterative reconstruction techniques (IRTs) have been shown to produce images of superior quality over conventional filtered back projection (FBP) based algorithms, the use of IRT in a clinical setting has been hampered by the significant computational demands of these algorithms. In this paper we present results of our efforts to overcome this hurdle by […]

* * *


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: one with an nVidia and an AMD graphics processing unit, the other with two AMD graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
