Tore Kasper Frederiksen, Thomas P. Jakobsen, Jesper Buus Nielsen
We present a new protocol for maliciously secure two-party computation based on cut-and-choose of garbled circuits using the recent idea of "forge-and-loose", which eliminates around a factor of 3 of the garbled circuits that need to be constructed and evaluated. Our protocol introduces a new way to realize the "forge-and-loose" approach which avoids an auxiliary secure two-party computation […]
Oscar Borries, Hans Henrik Brandenborg Sorensen, Bernd Dammann, Erik Jorgensen, Peter Meincke, Stig Busk Sorensen, Per Christian Hansen
The Physical Optics approximation is a widely used asymptotic method for calculating the scattering from electrically large bodies. It requires significant computational work and little memory, and is thus well suited for application on a Graphics Processing Unit. Here, we investigate the performance of an implementation and demonstrate that while there are some implementational pitfalls, […]
Mohammad Zubair Ahmad
The Internet ecosystem, comprising thousands of Autonomous Systems (ASes), now includes Internet eXchange Points (IXPs) as another critical component of the infrastructure. Peering plays a significant part in driving the economic growth of ASes and is contributing to a variety of structural changes in the Internet. IXPs are a primary component of this peering […]
Khari A. Armih
High performance architectures are increasingly heterogeneous with shared and distributed memory components, and accelerators like GPUs. Programming such architectures is complicated and performance portability is a major issue as the architectures evolve. This thesis explores the potential for algorithmic skeletons integrating a dynamically parametrised static cost model, to deliver portable performance for mostly regular data […]
Viragkumar N. Jagtap, Shailendra K. Mishra
Handwriting recognition is in high demand in both commercial and academic settings. In recent years, a lot of good work has been done on handwritten digit recognition to improve accuracy. A handwritten digit recognition system needs a larger dataset and a long training time to improve accuracy and reduce the error rate. Training of Neural Networks for large data sets is […]
Bhavneet Kaur, Sonika Jindal
CBIR is a method of searching for digital images in an image database. "Content-based" means that the search analyzes the contents of the image, such as colours, shapes, textures, or any other information that can be derived from the image itself, rather than its metadata. The GPU is a powerful graphics engine and a highly […]
Nguyen Quang-Hung, Le Thanh Tan, Chiem Thach Phat, Nam Thoai
In this paper, we consider power-aware task scheduling (PATS) in HPC clouds. Users request virtual machines (VMs) to execute their tasks. Each task is executed on one single VM, and requires a fixed number of cores (i.e., processors), computing power (million instructions per second – MIPS) of each core, a fixed start time and non-preemption […]
Eric Papenhausen, Klaus Mueller
Graphics processing units (GPUs) have become widely adopted in the medical imaging community. The parallel SIMD nature of GPUs maps perfectly to many reconstruction algorithms. Because of this, it is relatively straightforward to parallelize common reconstruction algorithms (e.g. FDK backprojection). This means that significant performance improvements must come from careful memory optimizations, exploiting ASICs and […]
Fan Li, Ming-lu Jin
As a population-based algorithm, Ant Colony Optimization (ACO) is intrinsically massively parallel, and therefore it is expected to be well-suited for implementation on GPUs (Graphics Processing Units). In this paper, we present a novel ant colony optimization algorithm (called GACO), which is based on a Compute Unified Device Architecture (CUDA) enabled GPU. In the GACO algorithm, we utilize […]
M. Wozniak, K. Kuznik, M. Paszynski, V. M. Calo, D. Pardo
In this paper we present computational cost estimates for a parallel shared-memory isogeometric multi-frontal solver. The estimates show that the ideal isogeometric shared-memory parallel direct solver scales as O(p^2 log(N/p)) for one-dimensional problems, O(Np^2) for two-dimensional problems, and O(N^(4/3)p^2) for three-dimensional problems, where N is the number of degrees of freedom, […]
Batliwala Saifuddin, Kadtan Lalit, Khan Mujjammil, Khandagale Pratik, S. M. Walunj
Parallel programming has become simple and reasonable with the advent of GPGPUs. Nowadays, many programmers port their applications to GPGPUs thanks to the availability of APIs such as NVIDIA’s CUDA. But writing a CUDA program is a very tricky task. Most of the industry extensively uses immense serial C code, and they are […]
David Mainzer, Gabriel Zachmann
We present a novel approach to perform collision detection queries between rigid and/or deformable models. Our method can handle arbitrary deformations, even discontinuous ones. For this, we subdivide the whole scene with all objects into connected but totally independent parts by a fuzzy clustering algorithm. Then, for every part, our algorithm performs a Principal […]

* * *


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes, the first with two AMD graphics processing units and the second with one AMD and one nVidia graphics processing unit. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hgpu.org