Mohammad Zubair Ahmad
The Internet ecosystem comprising of thousands of Autonomous Systems (ASes) now include Internet eXchange Points (IXPs) as another critical component in the infrastructure. Peering plays a significant part in driving the economic growth of ASes and is contributing to a variety of structural changes in the Internet. IXPs are a primary component of this peering […]
View View   Download Download (PDF)   
Andrew L. Beam, Alison Motsinger-Reif, Jon Doyle
Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions. A non-parametric Bayesian approach in the form of a Bayesian neural network is proposed for use in analyzing genetic association […]
Vivek Kulkarni
One of the major research trends currently is the evolution of heterogeneous parallel computing. GP-GPU computing is being widely used and several applications have been designed to exploit the massive parallelism that GP-GPU’s have to offer. While GPU’s have always been widely used in areas of computer vision for image processing, little has been done […]
View View   Download Download (PDF)   
Stephen Tyree, Jacob R. Gardner, Kilian Q. Weinberger, Kunal Agrawal, John Tran
In this paper, we evaluate the performance of various parallel optimization methods for Kernel Support Vector Machines on multicore CPUs and GPUs. In particular, we provide the first comparison of algorithms with explicit and implicit parallelization. Most existing parallel implementations for multi-core or GPU architectures are based on explicit parallelization of Sequential Minimal Optimization (SMO) […]
Orhan Firat, Alptekin Temizel
Parallelization of scientific problems is a challenging task which has a wide application area both on distributed programming, cloud computing and recently on GPGPU. Spectral graph partitioning is a widely used technique in many fields such as image processing, scientific computing, machine learning etc. In this study we analyze spectral graph partitioning subroutines on a […]
View View   Download Download (PDF)   
Michael Benguigui, Francoise Baude
This article presents a multi-GPU adaptation of a specific Monte Carlo and classification based method for pricing American basket options, due to Picazo. The first part relates how to combine fine and coarse-grained parallelization to price American basket options. A dynamic strategy of kernel calibration is proposed. Doing so, our implementation on a reasonable size […]
View View   Download Download (PDF)   
Andrew D. Ker
The Projected Spatial Rich Model (PSRM) generates powerful steganalysis features, but requires the calculation of tens of thousands of convolutions with image noise residuals. This makes it very slow: the reference implementation takes an impractical 20{30 minutes per 1 megapixel (Mpix) image. We present a case study which first tweaks the definition of the PSRM […]
View View   Download Download (PDF)   
Ying Zhang, Saizheng Zhang
In this paper, we introduce an optimized deep learning architecture with flexible layer structures and fast matrix operation kernels on parallel computing platform (e.g. NIVDIA’s GPU). Carefully layer-wise designed strategies are conducted to integrate different kinds of deep architectures into a uniform neural training-testing system. Our fast matrix operation kernels are implemented in deep architectures’ […]
View View   Download Download (PDF)   
Gary Macindoe
The use of linear algebra routines is fundamental to many areas of computational science, yet their implementation in software still forms the main computational bottleneck in many widely used algorithms. In machine learning and computational statistics, for example, the use of Gaussian distributions is ubiquitous, and routines for calculating the Cholesky decomposition, matrix inverse and […]
View View   Download Download (PDF)   
Ronan Collobert, Koray Kavukcuoglu, Clement Farabet
Neural networks and machine learning algorithms in general require a flexible environment where new algorithm prototypes and experiments can be set up as quickly as possible with best possible computational performance. To that end, we provide a new framework called Torch7, that is especially suited to achieve both of these competing goals. Torch7 is a […]
View View   Download Download (PDF)   
John Canny, Huasha Zhao
This paper describes recent work on the BIDMach toolkit for large-scale machine learning. BIDMach has demonstrated single-node performance that exceeds that of published cluster systems for many common machine-learning task. BIDMach makes full use of both CPU and GPU acceleration (through a sister library BIDMat), and requires only modest hardware (commodity GPUs). One of the […]
W. Liu, H. Zhang, D. Tao, Y. Wang, K. Lu
Principal component analysis (PCA) is a statistical technique commonly used in multivariate data analysis. However, PCA can be difficult to interpret and explain since the principal components (PCs) are linear combinations of the original variables. Sparse PCA (SPCA) aims to balance statistical fidelity and interpretability by approximating sparse PCs whose projections capture the maximal variance […]
View View   Download Download (PDF)   
Page 1 of 712345...Last »

* * *

* * *

* * *

Free GPU computing nodes at

Registered users can now run their OpenCL application at We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to will be treated according to our Privacy Policy

HGPU group © 2010-2014

All rights belong to the respective authors

Contact us: