Matthias Bach
Quarks and gluons are the building blocks of all hadronic matter, such as protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory being tested by large-scale experiments such as the Large Hadron Collider (LHC) at CERN and, in the future, the Facility for Antiproton and Ion Research (FAIR) at GSI. However, […]
Jiarui Lei, Wei Miao, Fengwen Song, Yuanlang Song, Jingyuan Sun
Video chatting is now a popular way of communicating. However, a poor network connection ruins the experience because faces become blurred. To solve this problem, the team offers a solution that preserves the clarity of faces under a limited transmission rate. In this project, the primary goal is to design a video encoder that reduces the size […]
Chen Liu, Benjamin Petroski, Guthrie Cordone, Gildo Torres, Stephanie Schuckers
Biometric matching has been widely adopted as a secure method for identification and verification. However, the computational demand of running this algorithm on a large data set poses a great challenge to the underlying hardware platform. Even though modern processors are equipped with more cores and memory capacity, the software algorithm still requires careful […]
Alex Rubinsteyn
The Python programming language has become a popular platform for data analysis and scientific computing. To mitigate the poor performance of Python’s standard interpreter, numerically intensive computations are typically offloaded to library functions written in high-performance compiled languages such as Fortran or C. When there is no efficient library implementation available for a particular algorithm, […]
Raksha Patel, Isha Vajani
Face detection finds applications in various fields in today’s world. However, a single-threaded CPU implementation of face detection consumes a lot of time and, despite various optimization techniques, performs poorly in real time. With the advent of General Purpose GPU (GPGPU) computing and growing support for parallel programming languages like CUDA, it has become possible […]
Czeslaw Smutnicki, Jaroslaw Rudy, Dominik Zelazny
A new and very efficient parallel algorithm for the Fast Non-dominated Sorting of Pareto fronts is proposed. By decreasing the computational complexity, the proposed method allows us to speed up the best Fast and Elitist Multi-Objective Genetic Algorithm to date (NSGA-II) by more than two orders of magnitude. Formal proofs […]
Jonas Martinez, Frederic Claux, Sylvain Lefebvre
In this paper, we propose to extend high quality Centroidal Voronoi Tessellation (CVT) remeshing techniques to the case of surfaces which are not defined by triangle meshes, such as implicit surfaces. Our key observation is that rasterization routines are usually available to visualize these alternative representations, most often as OpenGL shaders efficiently producing surface samples […]
Prakash N Ekhande, Sharad A Rumane, Mayur A Ahire
The segmentation of text from badly degraded document images is a very hard task due to the high intra-variation between the document background and the foreground text of different document images. The algorithms used for image processing take more time to execute on a single-core processor. The Graphics Processing Unit (GPU) is becoming very popular due […]
Stanley Tsang
Two well-known bipartite graph matching algorithms, the Gale-Shapley algorithm and the Hungarian (Kuhn-Munkres) algorithm, have been ported to run on General-Purpose Graphics Processing Units (GPGPU) using kernels written with the CUDA programming model. This was done with the goal of characterising and assessing the performance and behaviour of these matching algorithms on the GPU, and […]
Rashid Kaleem, Sreepathi Pai, Keshav Pingali
Irregular algorithms such as Stochastic Gradient Descent (SGD) can benefit from the massive parallelism available on GPUs. However, unlike in data-parallel algorithms, synchronization patterns in SGD are quite complex. Furthermore, scheduling for scale-free graphs is challenging. This work examines several synchronization strategies for SGD, ranging from simple locking to conflict-free scheduling. We observe that static […]
Sergey Voronin, Per-Gunnar Martinsson
This document describes an implementation in C of a set of randomized algorithms for computing partial Singular Value Decompositions (SVDs). The techniques largely follow the prescriptions in the article "Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions," N. Halko, P.G. Martinsson, J. Tropp, SIAM Review, 53(2), 2011, pp. 217-288, but with some […]
Roman Iakymchuk, David Defour, Sylvain Collange, Stef Graillat
On modern parallel architectures, floating-point computations may become non-deterministic and, therefore, non-reproducible, mainly due to the non-associativity of floating-point operations. We propose an algorithm to solve dense triangular systems by leveraging the standard parallel triangular solver and our recently introduced multi-level exact summation approach. Finally, we present implementations of the proposed fast reproducible triangular solver and […]

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of computer time per run on two nodes: the first has one nVidia and one AMD graphics processing unit, and the second has two AMD graphics processing units. There are no restrictions on the number of runs.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: AMD APP SDK 2.9

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the compilation and execution terminal output logs will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
