Guillaume Chapuis, Hristo Djidjev
We develop an efficient parallel algorithm for answering shortest-path queries in planar graphs and implement it on multi-node CPU/GPU clusters. The algorithm uses a divide-and-conquer approach for decomposing the input graph into small and roughly equal subgraphs and constructs a distributed data structure containing shortest distances within each of those subgraphs and between their […]
Francesco Lettich, Salvatore Orlando, Claudio Silvestri
The ability to process significant amounts of continuously updated spatial data in a timely manner is mandatory for an increasing number of applications. In this paper we focus on a specific data-intensive problem concerning the repeated processing of huge amounts of k nearest neighbours (k-NN) queries over massive sets of moving objects, where the spatial extents of queries […]
Hisham Mohamed
This thesis studies the scalability of the similarity search problem in large-scale multidimensional data. Similarity search, which translates into the neighbour search problem, finds many applications in information retrieval, visualization, machine learning and data mining. The current exponential growth of data motivates the need for approximate and scalable algorithms. In most existing algorithms and data structures, […]
Saad Quader
The problem of computing the Betweenness Centrality (BC) is important in analyzing graphs in many practical applications like social networks, biological networks, transportation networks, electrical circuits, etc. Since this problem is computation intensive, researchers have been developing algorithms using high performance computing resources like supercomputers, clusters, and Graphics Processing Units (GPUs). Current GPU algorithms for […]
Joshua Michael Pyle
Graphics Processing Units (GPUs) have been used to enhance the speed and efficiency of data structures and algorithms alike. A common data structure used in Computer Science is the Bloom Filter, which is used in many types of applications including databases and security logging. The Bloom Filter is a lossy data structure that uses […]
Johannes Koster, Sven Rahmann
We present PEANUT (ParallEl AligNment UTility), a highly parallel GPU-based read mapper with several distinguishing features, including a novel q-gram index (called the q-group index) with a small memory footprint built on-the-fly over the reads and the possibility to output either the best hits or all hits of a read. Designing the algorithm particularly for the […]
Amlan Chatterjee, Sridhar Radhakrishnan, John K. Antonio
The availability and utility of large numbers of Graphical Processing Units (GPUs) have enabled parallel computations using extensive multi-threading. Sequential access to global memory and contention at the size-limited shared memory have been the main impediments to fully exploiting potential performance in architectures having a massive number of GPUs. After performing an extensive study of data structures […]
David S. Hardin, Samuel S. Hardin
As Graphics Processing Units (GPUs) have gained in capability and GPU development environments have matured, developers are increasingly turning to the GPU to off-load the main host CPU of numerically-intensive, parallelizable computations. Modern GPUs feature hundreds of cores, and offer programming niceties such as double-precision floating point, and even limited recursion. This shift from CPU […]
Mark Joselli, Jose Ricardo Silva Junior, Marcelo Zamith, Esteban Clua, Eduardo Soluri
Simulation and visualization of particles in real-time can be a computationally intensive task. This intensity comes from diverse factors, one of them being the O(n^2) complexity of the traversal algorithm necessary for the proximity queries between all pairs of particles, which decide whether collisions need to be computed. Previous works reduced this complexity by considerably […]
Gang Liao, Qi Sun, Longfei Ma, Zhihui Qin
In this paper, a contrastive evaluation of massively parallel implementations of suffix tree and suffix array to accelerate genome sequence matching is proposed, based on an Intel Core i7 3770K quad-core CPU and an NVIDIA GeForce GTX680 GPU (Kepler architecture). Due to the more regular execution flow of the indexed binary search algorithm, the more efficient use of the […]
Jiayang Jiang, Michael Mitzenmacher, Justin Thaler
The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than k are removed until there are no vertices of degree less than k left. The remaining hypergraph is known as the k-core. In this paper, we analyze parallel peeling processes, where […]
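The peeling process described in this abstract can be illustrated with a minimal sequential sketch. It assumes hyperedges are given as tuples of vertex ids and that removing a vertex also removes every hyperedge containing it (an assumption; the truncated abstract does not spell this out). The function name `k_core` is hypothetical, and this is not the authors' parallel algorithm.

```python
from collections import defaultdict


def k_core(edges, k):
    """Peel a hypergraph: repeatedly remove vertices of degree < k.

    edges: list of tuples of vertex ids (one tuple per hyperedge).
    Returns (surviving vertices, indices of surviving hyperedges).
    Assumption: removing a vertex deletes all hyperedges that contain it.
    """
    # Map each vertex to the indices of the hyperedges it belongs to.
    incident = defaultdict(set)
    for i, edge in enumerate(edges):
        for v in edge:
            incident[v].add(i)

    alive_vertices = set(incident)
    alive_edges = set(range(len(edges)))

    # Start with every vertex whose degree is already below k.
    stack = [v for v in alive_vertices if len(incident[v]) < k]
    while stack:
        v = stack.pop()
        if v not in alive_vertices:
            continue  # already peeled via an earlier stack entry
        alive_vertices.discard(v)
        for i in list(incident[v]):
            alive_edges.discard(i)
            # Removing hyperedge i lowers the degree of its other endpoints.
            for u in edges[i]:
                incident[u].discard(i)
                if u in alive_vertices and len(incident[u]) < k:
                    stack.append(u)
    return alive_vertices, alive_edges


# Usage: a triangle with one pendant vertex, viewed as a 2-uniform hypergraph.
example = [(0, 1), (1, 2), (0, 2), (2, 3)]
core_vertices, core_edges = k_core(example, 2)
```

With k = 2 the pendant vertex 3 is peeled and the triangle survives; with k = 3 every vertex is eventually removed and the k-core is empty.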
Marcel Birn, Vitaly Osipov, Peter Sanders, Christian Schulz, Nodari Sitchinava
We show that a simple algorithm for computing a matching on a graph runs in a logarithmic number of phases incurring work linear in the input size. The algorithm can be adapted to provide efficient algorithms in several models of computation, such as PRAM, External Memory, MapReduce and distributed memory models. Our CREW PRAM algorithm […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
