Dec, 7

Rendering Volumetric Haptic Shapes in Mid-Air using Ultrasound

We present a method for creating three-dimensional haptic shapes in mid-air using focused ultrasound. This approach applies the principles of acoustic radiation force, whereby the non-linear effects of sound produce forces on the skin which are strong enough to generate tactile sensations. This mid-air haptic feedback eliminates the need for any attachment of actuators or […]
Dec, 7

Parallel GPU Processing for Fast Radio Signal Propagation Computation in GRASS-RaPlaT

Radio propagation simulation tools are important for prediction and verification of the radio signal coverage by individual transmitters or transmitter networks such as mobile phone cellular networks. In the case of a large geographic area with a relative high resolution, the simulation can become computationally demanding, taking a considerable amount of time to accomplish. Parallel […]
Dec, 7

Neural Networks through Shared Maps in Mobile Devices

We introduce a hybrid system composed of a convolutional neural network and a discrete graphical model for image recognition. This system improves upon traditional sliding window techniques for analysis of an image larger than the training data by effectively processing the full input scene through the neural network in less time. The final result is […]
Dec, 5

CUBPT: Lock-free bulk insertions to B+ tree on GPU architecture

B+-tree is one of the most widely-used index structures. To improve insertion process, several batch algorithms are proposed, which all use one thread to complete one node insertion and cannot make full use of GPU’s parallel throughput. So, a batch building and insertion method on GPU named CUBPT is proposed in this paper. During the […]
Dec, 5

Coulomb and Landau Gauge Fixing in GPUs using CUDA and MILC

In this work, we present the GPU implementation of the overrelaxation and steepest descent method with Fourier acceleration methods for Laudau and Coulomb gauge fixing using CUDA for SU(N) with N>2. A multi-GPU implementation of the overrelaxation method is also presented using MPI and CUDA. The GPU performance was measured on BlueWaters and compared against […]
Dec, 5

Software Polarization Spectrometer "PolariS"

We have developed a software-based polarization spectrometer, PolariS, to acquire full-Stokes spectra with a very high spectral resolution of 61 Hz. The primary aim of PolariS is to measure the magnetic fields in dense star-forming cores by detecting the Zeeman splitting of molecular emission lines. The spectrometer consists of a commercially available digital sampler and […]
Dec, 5

Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. […]
Dec, 5

IPMACC: Open Source OpenACC to CUDA/OpenCL Translator

In this paper we introduce IPMACC, a framework for translating OpenACC applications to CUDA or OpenCL. IPMACC is composed of set of translators translating OpenACC for C applications to CUDA or OpenCL. The framework uses the system compiler (e.g. nvcc) for generating final accelerator’s binary. The framework can be used for extending the OpenACC API, […]
Dec, 3

OpenCL Based High-Quality HEVC Motion Estimation on GPU

This paper presents a high quality H.265/HEVC motion estimation implementation with the cooperation of CPU and GPU. The data dependency from MVP (Motion Vector Predictor) restricts the degree of parallelism on GPU. To overcome the constraint from MVP, we propose to use an estimated MVP on GPU and the accurate MVP to refine the motion […]
Dec, 3

Implementation of k-Means Clustering Algorithm in CUDA

Big Data poses a very great computational challenge for programmers as well as machines as a lot of number crunching is to be done.Due to recent development in the shared memory inexpensive architecture like Graphics Processing Units (GPU), an alternative has emerged. In this paper, we target at decreasing runtime for k-Means, which is one […]
Dec, 3

Numerical cosmology on the GPU with Enzo and Ramses

A number of scientific numerical codes can currently exploit GPUs with remarkable performance. In astrophysics, Enzo and Ramses are prime examples of such applications. The two codes have been ported to GPUs adopting different strategies and programming models, Enzo adopting CUDA and Ramses using OpenACC. We describe here the different solutions used for the GPU […]
Dec, 3

24.77 Pflops on a Gravitational Tree-Code to Simulate the Milky Way Galaxy with 18600 GPUs

We have simulated, for the first time, the long term evolution of the Milky Way Galaxy using 51 billion particles on the Swiss Piz Daint supercomputer with our $N$-body gravitational tree-code Bonsai. Herein, we describe the scientific motivation and numerical algorithms. The Milky Way model was simulated for 6 billion years, during which the bar […]
Page 5 of 775« First...34567...102030...Last »

* * *

* * *

Like us on Facebook

HGPU group

194 people like HGPU on Facebook

Follow us on Twitter

HGPU group

1331 peoples are following HGPU @twitter

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors

Contact us: