Felix Weninger, Johannes Bergmann, Bjorn Schuller
In this article, we introduce CURRENNT, an open-source parallel implementation of deep recurrent neural networks (RNNs) supporting graphics processing units (GPUs) through NVIDIA’s Compute Unified Device Architecture (CUDA). CURRENNT supports uni- and bidirectional RNNs with Long Short-Term Memory (LSTM) memory cells, which overcome the vanishing gradient problem. To our knowledge, CURRENNT is the first publicly […]
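As a rough illustration of the LSTM cell that CURRENNT accelerates on the GPU (this sketch does not use CURRENNT's API; all names in it are made up for the example), a single forward step in NumPy looks as follows. The additive cell-state update is what lets gradients survive long time lags:

    # Illustrative only: one forward step of a standard LSTM cell in NumPy.
    # CURRENNT implements this (plus the backward pass) in CUDA on the GPU.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, U, b):
        """W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) biases,
        laid out as [input gate, forget gate, output gate, cell candidate]."""
        H = h_prev.shape[0]
        z = W @ x + U @ h_prev + b
        i = sigmoid(z[0:H])          # input gate
        f = sigmoid(z[H:2*H])        # forget gate
        o = sigmoid(z[2*H:3*H])      # output gate
        g = np.tanh(z[3*H:4*H])      # candidate cell state
        c = f * c_prev + i * g       # additive cell-state path
        h = o * np.tanh(c)           # hidden state / cell output
        return h, c

    # Tiny usage example with random weights, unrolled over 5 time steps
    D, H = 8, 16
    rng = np.random.default_rng(0)
    W, U, b = rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H)
    h, c = np.zeros(H), np.zeros(H)
    for t in range(5):
        h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
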
Gil Shapira, Tal Hassner
The 2D Least Median of Squares (LMS) is a popular tool in robust regression because of its high breakdown point: up to half of the input data can be contaminated with outliers without affecting the accuracy of the LMS estimator. The complexity of 2D LMS estimation has been shown to be $\Omega(n^2)$, where $n$ is […]
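For reference, the LMS line-fit objective referred to above can be written as follows (the notation is illustrative, not necessarily the authors'): the estimator minimizes the median of the squared residuals, which is why up to roughly half of the points may be arbitrary outliers without breaking the fit.

    $$ \hat{\theta}_{\mathrm{LMS}} = \arg\min_{\theta=(a,b)} \ \operatorname*{med}_{1 \le i \le n} r_i^2(\theta), \qquad r_i(\theta) = y_i - (a\,x_i + b) $$
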
Georgios Bekiaris, Karl Glazebrook, Christopher J. Fluke, Roberto Abraham
With large-scale Integral Field Spectroscopy (IFS) surveys of thousands of galaxies currently underway or planned, the astronomical community is in need of methods, techniques and tools that will allow the analysis of huge amounts of data. We focus on the kinematic modelling of disc galaxies and investigate the potential use of massively parallel architectures, such […]
Rahul Garg
Array-based languages such as MATLAB and Python (with NumPy) have become very popular for scientific computing. However, the performance of the implementations of these languages is often lacking. For example, some of the implementations are interpreted. Further, these languages were not designed with multi-core CPUs and GPUs in mind and thus don’t take full advantage […]
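The performance gap hinted at above is easy to reproduce with a toy experiment (illustrative only, not taken from the paper): the same reduction written as an interpreted Python loop and as a single vectorized NumPy call differs by orders of magnitude in runtime, and neither version touches the GPU.

    # Toy comparison: interpreted element-by-element loop vs. one vectorized call.
    import time
    import numpy as np

    x = np.random.rand(2_000_000)

    t0 = time.perf_counter()
    s_loop = 0.0
    for v in x:                      # interpreted, element by element
        s_loop += v * v
    t1 = time.perf_counter()

    s_vec = float(np.dot(x, x))      # single call into compiled code
    t2 = time.perf_counter()

    print(f"loop: {t1 - t0:.3f}s   vectorized: {t2 - t1:.5f}s")
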
Azzam Haidar, Stanimire Tomov, Piotr Luszczek, Jack Dongarra
Embedded computing, not only in large systems like drones and hybrid vehicles, but also in small portable devices like smart phones and watches, gets more extreme to meet ever-increasing demands for extended and improved functionalities. This, combined with the typical constraints of low power consumption and small size, makes the design of numerical libraries […]
Ang Li, Radu Serban, Dan Negrut
We discuss an approach for solving sparse or dense banded linear systems $\mathbf{A}\mathbf{x} = \mathbf{b}$ on a Graphics Processing Unit (GPU) card. The matrix $\mathbf{A} \in \mathbb{R}^{N \times N}$ is possibly nonsymmetric and moderately large; i.e., $10000 \leq N \leq 500000$. The \textit{split and parallelize} (\texttt{SaP}) approach seeks […]
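For readers who want a CPU reference point (this is not the SaP GPU solver described above), a small banded system $\mathbf{A}\mathbf{x} = \mathbf{b}$ can be solved with LAPACK's banded storage through SciPy:

    # Reference-only sketch: a tridiagonal system in LAPACK band storage.
    import numpy as np
    from scipy.linalg import solve_banded

    N = 8
    # Band format: row 0 = super-diagonal, row 1 = main diagonal, row 2 = sub-diagonal.
    ab = np.zeros((3, N))
    ab[0, 1:] = -1.0     # super-diagonal
    ab[1, :]  =  2.0     # main diagonal
    ab[2, :-1] = -1.0    # sub-diagonal

    b = np.ones(N)
    x = solve_banded((1, 1), ab, b)   # (1, 1): one sub- and one super-diagonal
    print(x)
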
John-Alexander M. Assael
Data-efficient learning in continuous state-action spaces using high-dimensional observations remains an elusive challenge in developing fully autonomous systems. An instance of this challenge is the pixels to torques problem, which identifies key elements of an autonomous agent: autonomous thinking and decision making using sensor measurements only, learning from mistakes, and applying past experiences to novel […]
Gunther Lukat, Robi Banerjee
We present a GPU-accelerated CUDA-C implementation of the Barnes-Hut (BH) tree code for calculating the gravitational potential on octree adaptive meshes. The tree code algorithm is implemented within the FLASH4 adaptive mesh refinement (AMR) code framework and is therefore fully MPI parallel. We describe the algorithm and present test results that demonstrate its […]
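For context, the standard Barnes-Hut approximation replaces a distant group of particles by its monopole whenever a tree node passes the opening-angle test (the symbols below are the usual ones, not necessarily the paper's notation): with node size $s$, distance $d$ from the evaluation point to the node's centre of mass $\mathbf{r}_{\mathrm{cm}}$, total node mass $M$, and opening angle $\theta$,

    $$ \frac{s}{d} < \theta \quad\Longrightarrow\quad \Phi(\mathbf{r}) \approx -\frac{G\,M}{\lVert \mathbf{r} - \mathbf{r}_{\mathrm{cm}} \rVert} $$
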
Vlad Olaru, Mihai Florea, Cristian Sminchisescu
This paper presents a framework that supports the implementation of parallel solutions for the widespread parametric maximum flow computational routines used in image segmentation algorithms. The framework is based on supergraphs, a special construction combining several image graphs into a larger one, and works on various architectures (multi-core or GPU), either locally or remotely in […]
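As background (the notation is illustrative, not the paper's), parametric maximum flow routines of this kind minimize a binary labelling energy of the form below, where the unary terms vary monotonically with a parameter $\lambda$, so the cuts for a whole range of $\lambda$ values can be obtained from a family of max-flow/min-cut problems on the image graph:

    $$ \min_{x \in \{0,1\}^{n}} E_\lambda(x) = \sum_{i} \theta_i(x_i; \lambda) + \sum_{(i,j) \in \mathcal{E}} \theta_{ij}(x_i, x_j) $$
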
Cedric Nugteren, Valeriu Codreanu
This work presents CLTune, an auto-tuner for OpenCL kernels. It evaluates and tunes kernel performance over a generic, user-defined search space of possible parameter-value combinations. Example parameters include the OpenCL workgroup size, vector data-types, tile sizes, and loop unrolling factors. CLTune can be used in the following scenarios: 1) when there are too many tunable […]
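The core loop of such an auto-tuner is simple to sketch (the code below is a toy illustration in Python, not CLTune's C++ API; run_kernel and the parameter names are hypothetical stand-ins for compiling and timing an OpenCL kernel with a given configuration):

    # Toy exhaustive search over a kernel parameter space.
    import itertools

    search_space = {
        "workgroup_size": [32, 64, 128, 256],
        "tile_size": [8, 16, 32],
        "unroll_factor": [1, 2, 4],
    }

    def run_kernel(config):
        """Stand-in for: compile the kernel with these parameters, run it, return its runtime.
        Returns a synthetic number so the example is self-contained."""
        return (config["workgroup_size"] // config["tile_size"]) + 1.0 / config["unroll_factor"]

    best_time, best_config = float("inf"), None
    keys = list(search_space)
    for values in itertools.product(*(search_space[k] for k in keys)):
        config = dict(zip(keys, values))
        t = run_kernel(config)
        if t < best_time:
            best_time, best_config = t, config

    print("fastest configuration:", best_config)
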
Carl Yuheng Ren, Victor Adrian Prisacariu, Ian D Reid
We introduce a parallel GPU implementation of the Simple Linear Iterative Clustering (SLIC) superpixel segmentation. Using a single graphics card, our implementation achieves speedups of up to 83x over the standard sequential implementation. Our implementation is fully compatible with the standard sequential implementation, and the software is now available online as open source.
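For reference, the per-pixel assignment step in the original SLIC formulation uses the combined colour/space distance below, with $d_c$ the CIELAB colour distance to a cluster centre, $d_s$ the spatial distance, $S$ the sampling grid interval, and $m$ the compactness weight; the distance is evaluated independently for every pixel, which is what makes the algorithm map naturally onto a GPU.

    $$ D = \sqrt{\, d_c^{2} + \left(\frac{d_s}{S}\right)^{2} m^{2} \,} $$
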
David Hall
Syntactic parsing is one of the core tasks of natural language processing, with many applications in downstream NLP tasks, from machine translation and summarization to relation extraction and coreference resolution. Parsing performance on English texts, particularly well-edited newswire text, is generally regarded as quite good. However, state-of-the-art constituency parsers produce incorrect parses for more […]
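As a reminder of what constituency parsing computes, the sketch below is a plain CKY recognizer for a tiny grammar in Chomsky normal form (purely illustrative; it is not the GPU-accelerated parser described above):

    # Minimal CKY recognizer over a toy CNF grammar.
    from collections import defaultdict

    binary_rules = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
    lexical_rules = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}

    def cky_recognize(words, start="S"):
        n = len(words)
        chart = defaultdict(set)          # chart[(i, j)] = nonterminals deriving words[i:j]
        for i, w in enumerate(words):
            chart[(i, i + 1)] = set(lexical_rules.get(w, set()))
        for span in range(2, n + 1):
            for i in range(0, n - span + 1):
                j = i + span
                for k in range(i + 1, j):                 # split point
                    for B in chart[(i, k)]:
                        for C in chart[(k, j)]:
                            chart[(i, j)] |= binary_rules.get((B, C), set())
        return start in chart[(0, n)]

    print(cky_recognize("the dog chased the cat".split()))   # True
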

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of computer time per run on each of two nodes equipped with AMD and nVidia graphics processing units (see the platform details below). There are no restrictions on the number of runs.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); compilation and execution terminal output logs will be provided to the user.
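For illustration only, a minimal OpenCL "vector add" written with PyOpenCL is shown below; the exact project layout accepted by the dashboard is described in its instructions, so treat this purely as a sketch of the kind of kernel that can run on the nodes above.

    # Minimal PyOpenCL example: add two vectors on whatever OpenCL device is available.
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    a = np.random.rand(1024).astype(np.float32)
    b = np.random.rand(1024).astype(np.float32)

    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

    prg = cl.Program(ctx, """
    __kernel void add(__global const float *a, __global const float *b, __global float *out) {
        int gid = get_global_id(0);
        out[gid] = a[gid] + b[gid];
    }
    """).build()

    prg.add(queue, a.shape, None, a_buf, b_buf, out_buf)
    out = np.empty_like(a)
    cl.enqueue_copy(queue, out, out_buf)
    print(np.allclose(out, a + b))   # True
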

The information sent to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
