Gunther Lukat, Robi Banerjee
We present a GPU-accelerated CUDA-C implementation of the Barnes-Hut (BH) tree code for calculating the gravitational potential on octree adaptive meshes. The tree code algorithm is implemented within the FLASH4 adaptive mesh refinement (AMR) code framework and is therefore fully MPI parallel. We describe the algorithm and present test results that demonstrate its […]
Agostino Dovier, Andrea Formisano, Enrico Pontelli, Flavio Vella
This paper illustrates the design and implementation of a conflict-driven ASP solver that is capable of exploiting the Single-Instruction Multiple-Thread parallelism offered by General Purpose Graphical Processing Units (GPUs). Modern GPUs are multi-core platforms, providing access to a large number of cores at a very low cost, but at the price of a complex architecture with […]
Kishalay De, Yashwant Gupta
A fully real-time coherent dedispersion system has been developed for the pulsar back-end at the Giant Metrewave Radio Telescope (GMRT). The dedispersion pipeline uses the single phased array voltage beam produced by the existing GMRT software back-end (GSB) to produce coherently dedispersed intensity output in real time, for the currently operational bandwidths of 16 MHz […]
Stefan Westerlund, Christopher Harris
Searching for sources of electromagnetic emission in spectral-line radio astronomy interferometric data is a computationally intensive process. Parallel programming techniques and High Performance Computing hardware may be used to improve the computational performance of a source finding program. However, it is desirable to further reduce the processing time of source finding in order to decrease […]
Unnikrishnan C, Y N Srikant, Rupesh Nasre
Graph algorithms are used in several domains such as social networking, biological sciences, computational geometry, and compilers, to name a few. It has been shown that they possess enough parallelism to keep several computing resources busy – even hundreds of cores on a GPU. Unfortunately, tuning their implementation for efficient execution on a particular hardware […]
Christopher D. Cooper, Lorena A. Barba
Interactions between surfaces and proteins occur in many vital processes and are crucial in biotechnology: the ability to control specific interactions is essential in fields like biomaterials, biomedical implants and biosensors. In the latter case, biosensor sensitivity hinges on ligand proteins adsorbing on bioactive surfaces with a favorable orientation, exposing reaction sites to target molecules. […]
Avadhesh Pratap Singh, Dhirendra Pratap Singh
The K-shortest path algorithm is a generalization of the shortest path algorithm. It is used in fields such as sequence alignment in molecular bioinformatics, robot motion planning, and path finding in gene networks, where the speed of path calculation plays a vital role. Parallel implementation is one of the best ways to fulfill the requirement of these […]
Dominik Charousset, Raphael Hiesgen, Thomas C. Schmidt
The actor model of computation has gained significant popularity over the last decade. Its high level of abstraction makes it appealing for concurrent applications in parallel and distributed systems. However, designing a real-world actor framework that subsumes full scalability, strong reliability, and high resource efficiency requires many conceptual and algorithmic additives to the original model. […]
H.A. Du Nguyen, Zaid Al-Ars, Georgios Smaragdos, Christos Strydis
The Inferior Olive (IO) in the brain, in conjunction with the cerebellum, is responsible for crucial sensorimotor-integration functions in humans. In this paper, we simulate a computationally challenging IO neuron model consisting of three compartments per neuron in a network arrangement on GPU platforms. Several GPU platforms of the two latest NVIDIA GPU architectures (Fermi, […]
Chok Meng Yip
Both the size and the computational activity of Vehicular Area Networks (VANETs) are growing. Simulation of VANETs requires not only the simulation of network standards, but also of the mobility of nodes. Such a dynamic system involves computation of node distance, routing protocols, the application layer, data sending, data receiving, etc. The simulation model of VANET requires both hardware and […]
Bo Joel Svensson, Michael Vollmer, Eric Holk, Trevor L. McDonell, Ryan R. Newton
High-level domain-specific languages for array processing on the GPU are increasingly common, but they typically only run on a single GPU. As computational power is distributed across more devices, languages must target multiple devices simultaneously. To this end, we present a compositional translation that fissions data-parallel programs in the Accelerate language, allowing subsequent compiler and […]
Paul Harvey, Saji Hameed, Wim Vanderbauwhede
FLEXPART is a popular simulator that models the transport and diffusion of air pollutants, based on the Lagrangian approach. It is capable of regional and global simulation and supports both forward and backward runs. A complex model like this contains many calculations suitable for parallelisation. Recently, a GPU-accelerated version of the simulator (FLEXCPP) has been […]
* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of computing time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the compilation and execution terminal output logs will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
