12655
Joshua Michael Pyle
Graphics Processing Units (GPUs) have been used to enhance the speed and efficiency of both data structures and algorithms alike. A common data structure used in Computer Science is the Bloom Filter, which is used in many types of applications including databases and security logging. The Bloom Filter is a lossy data structure that uses […]
View View   Download Download (PDF)   
Johannes Koster, Sven Rahmann
We present PEANUT (ParallEl AligNment UTility), a highly parallel GPU-based read mapper with several distinguishing features, including a novel q-gram index (called the q-group index) with small memory footprint built on-the-fly over the reads and the possibility to output both the best hits or all hits of a read. Designing the algorithm particularly for the […]
Amlan Chatterjee, Sridhar Radhakrishnan, John K. Antonio
The availability and utility of large numbers of Graphical Processing Units (GPUs) have enabled parallel computations using extensive multi-threading. Sequential access to global memory and contention at the size-limited shared memory have been main impediments to fully exploiting potential performance in architectures having a massive number of GPUs. After performing extensive study of data structures […]
View View   Download Download (PDF)   
David S. Hardin, Samuel S. Hardin
As Graphics Processing Units (GPUs) have gained in capability and GPU development environments have matured, developers are increasingly turning to the GPU to off-load the main host CPU of numerically-intensive, parallelizable computations. Modern GPUs feature hundreds of cores, and offer programming niceties such as double-precision floating point, and even limited recursion. This shift from CPU […]
View View   Download Download (PDF)   
Mark Joselli, Jose Ricardo Silva Junior, Marcelo Zamith, Esteban Clua, Eduardo Soluri
Simulation and visualization of particles in real-time can be a computationally intensive task. This intensity comes from diverse factories, being one of them is the O(n^2) complexity of the traversal algorithm, necessary for the proximity queries of all pair of particles that decide the need to compute collisions. Previous works reduced this complexity by considerably […]
View View   Download Download (PDF)   
Gang Liao, Qi Sun, Longfei Ma, Zhihui Qin
In this paper, a contrastive evaluation of massive parallel implementations of suffix tree and suffix array to accelerate genome sequence matching are proposed based on Intel Core i7 3770K quad-core and NVIDIA GeForce GTX680 GPU(kepler architecture). Due to the more regular execution flow of the indexed binary search algorithm, the more efficient use of the […]
View View   Download Download (PDF)   
Jiayang Jiang, Michael Mitzenmacher, Justin Thaler
The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than k are removed until there are no vertices of degree less than k left. The remaining hypergraph is known as the k-core. In this paper, we analyze parallel peeling processes, where […]
View View   Download Download (PDF)   
Marcel Birn, Vitaly Osipov, Peter Sanders, Christian Schulz, Nodari Sitchinava
We show that a simple algorithm for computing a matching on a graph runs in a logarithmic number of phases incurring work linear in the input size. The algorithm can be adapted to provide efficient algorithms in several models of computation, such as PRAM, External Memory, MapReduce and distributed memory models. Our CREW PRAM algorithm […]
View View   Download Download (PDF)   
Andreas Beckmann, Jaroslaw Fedorowicz, Jorg Keller, Ulrich Meyer
We describe efficient algorithms to analyze the cycle structure of the graph induced by the state transition function of the A5/1 stream cipher used in GSM mobile phones and report on the results of the implementation. The analysis is performed in five steps utilizing HPC clusters, GPGPU and external memory computation. A great reduction of […]
View View   Download Download (PDF)   
Prabhakar Misra, Mainak Chaudhuri
Graphics processing units (GPUs) have emerged as a strong candidate for high-performance computing. While regular data-parallel computations with little or no synchronization are easy to map on the GPU architectures, it is a challenge to scale up computations on dynamically changing pointer-linked data structures. The traditional lock-based implementations are known to offer poor scalability due […]
View View   Download Download (PDF)   
Mark McCartin-Lim, Andrew McGregor, Rui Wang
We introduce a new spatial data structure for high dimensional data called the emph{approximate principal direction tree} (APD tree) that adapts to the intrinsic dimension of the data. Our algorithm ensures vector-quantization accuracy similar to that of computationally-expensive PCA trees with similar time-complexity to that of lower-accuracy RP trees. APD trees use a small number […]
View View   Download Download (PDF)   
Anders Goude, Stefan Engblom
We present a highly general implementation of fast multipole methods on graphics processing units (GPUs). Our two-dimensional double precision code features an asymmetric type of adaptive space discretization leading to a particularly elegant and flexible implementation. All steps of the multipole algorithm are efficiently performed on the GPU, including the initial phase which assembles the […]
Page 1 of 512345

* * *

* * *

Like us on Facebook

HGPU group

149 people like HGPU on Facebook

Follow us on Twitter

HGPU group

1238 peoples are following HGPU @twitter

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors

Contact us: