Deepthi Gummadi
For fast, effective analysis of large complex systems, high-performance computing is essential. The NVIDIA Compute Unified Device Architecture (CUDA)-assisted central processing unit (CPU) / graphics processing unit (GPU) computing platform has proven its potential for high-performance computing. In CPU/GPU computing, original data and instructions are copied from CPU main memory to […]
T. Beier
Standard PC hardware is rapidly gaining parallel computing power in the form of multicore CPUs and general-purpose GPUs. To take advantage of this situation, it is necessary to create specialized code. This is a very time-consuming and therefore expensive task. One approach to solving this problem is the OpenCL (Open Computing Language) standard. […]
Nathan Yong Seng Chong
This thesis is about scalable formal verification techniques for software. A verification technique is scalable if it is able to scale to reasoning about real (rather than synthetic or toy) programs. Scalable verification techniques are essential for practical program verifiers. In this work, we consider three key characteristics of scalability: precision, performance and automation. We […]
Grzegorz Michalski, Norbert Sczygiol, Siergiei Leonov
This paper presents a simulation of the casting solidification process performed on graphics processors compatible with the nVidia CUDA architecture. To implement the solidification simulation in parallel, it was necessary to modify the numerical model. The new approach shown in this paper allows the process of matrix building to be […]
E. Bajrovic, S. Benkner
Heterogeneous parallel architectures combining conventional multicore CPUs with GPUs and other types of accelerators promise significant performance gains compared to homogeneous systems. However, exploiting the full potential of such systems is becoming more and more challenging, often forcing programmers to combine different programming models and parallelization strategies. A promising approach to coping with the increased […]
Zachary Langbert, Mark C. Lewis
Physically accurate hard sphere collisions are inherently sequential as the order in which collisions occur can have a significant impact on the resulting system. This makes processing hard sphere collisions on parallel hardware challenging. We present an approach to solving this problem that can be implemented using OpenCL that runs on current hardware. This approach […]
Ardalan Amiri Sani, Lin Zhong, Dan S. Wallach
Legacy device drivers implement both device resource management and isolation. This results in a large code base with a wide high-level interface, making the driver vulnerable to security attacks. This is particularly problematic for increasingly popular accelerators like GPUs that have large, complex drivers. We solve this problem with library drivers, a new driver architecture. […]
Simon Jones, Matthew Studley, Alan Winfield
It is desirable for a robot to be able to run on-board simulations of itself in a model of the world to evaluate action consequences and test new controller solutions, but simulation is computationally expensive. Modern mobile System-on-Chip devices have high performance at low power consumption levels and now incorporate powerful graphics processing units, making […]
Vasco Costa, Joao M. Pereira, Joaquim A. Jorge
Global illumination techniques, such as ambient occlusion, can be performed in a physically accurate way via ray casting. However, ambient occlusion rays are incoherent. This means their computation is divergent, causing a degradation of rendering performance. This problem is particularly acute on GPU stream computing architectures, which have performance issues with thread divergence. We […]
Anders Boesen Lindbo Larsen
This technical report introduces CUDArray – a CUDA-accelerated subset of the NumPy library. The goal of CUDArray is to combine the ease of development from NumPy with the computational power of Nvidia GPUs in a lightweight and extensible framework. Since the motivation behind CUDArray is to facilitate neural network programming, CUDArray extends NumPy with a […]
Markus Steinberger, Michael Kenzel, Pedro Boechat, Bernhard Kerbl, Mark Dokter, Dieter Schmalstieg
In this paper, we present Whippletree, a novel approach to scheduling dynamic, irregular workloads on the GPU. We introduce a new programming model which offers the simplicity and expressiveness of task-based parallelism while retaining all aspects of the multilevel execution hierarchy essential to unlocking the full potential of a modern GPU. At the same time, […]
Francesco Lettich, Salvatore Orlando, Claudio Silvestri, Christian S. Jensen
The ability to process significant amounts of continuously updated spatial data in a timely manner is mandatory for an increasing number of applications. Parallelism enables such applications to face this data-intensive challenge and allows the devised systems to feature low latency and high scalability. In this paper we focus on a specific data-intensive problem, concerning the repeated processing […]

* * *


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). Terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors