Mar, 25

Analysis of illumination conditions at the lunar south pole using parallel computing techniques

In this Master Thesis an analysis of illumination conditions at the lunar south pole using parallel computing techniques is presented. Due to the small inclination (1.54°) of the lunar rotational axis with respect to the ecliptic plane and the topography of the lunar south pole, which allows long illumination periods, the study of illumination conditions […]
Mar, 25

The Feasibility of Using OpenCL Instead of OpenMP for Parallel CPU Programming

OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when developing and running compute-heavy code on a CPU. Both ease of programming and […]
Mar, 25

Data driven scheduling approach for the multi-node multi-GPU Cholesky decomposition

Recently, large-scale scientific computation on heterogeneous supercomputers equipped with accelerators has been attracting attention. However, traditional static job execution and memory management methods are insufficient to harness heterogeneous computing resources, including memory, efficiently, since they introduce larger data movement costs and lower resource usage. This paper takes the Cholesky decomposition computation, which […]
Mar, 23

Parallelization, Scalability, and Reproducibility in Next-Generation Sequencing Analysis

The analysis of next-generation sequencing (NGS) data is a major topic in bioinformatics: short reads obtained from DNA, the molecule encoding the genome of living organisms, are processed to provide insight into biological or medical questions. This thesis provides novel solutions to major topics within the analysis of NGS data, focusing on parallelization, scalability and […]
Mar, 23

Curracurrong: a stream processing system for distributed environments

Advances in technology have given rise to applications that are deployed on wireless sensor networks (WSNs), the cloud, and the Internet of Things. There are many emerging applications, some of which include sensor-based monitoring, web traffic processing, and network monitoring. These applications collect large amounts of data as an unbounded sequence of events and process […]
Mar, 23

GPU Kernels for High-Speed 4-Bit Astrophysical Data Processing

Interferometric radio telescopes often rely on computationally expensive O(N^2) correlation calculations; fortunately these computations map well to massively parallel accelerators such as low-cost GPUs. This paper describes the OpenCL kernels developed for the GPU based X-engine of a new hybrid FX correlator. Channelized data from the F-engine is supplied to the GPUs as 4-bit, offset-encoded […]
Mar, 23

Massively Parallel Construction of the Cell Graph

Motion planning is an important and well-studied field of robotics. A typical approach to finding a route is to construct a cell graph representing a scene and then to find a path in such a graph. In this paper we present and analyze parallel algorithms for constructing the cell graph on a SIMD-like GPU processor. […]
Mar, 23

A Financial Benchmark for GPGPU Compilation

Commodity many-core hardware is now mainstream, driven in particular by the evolution of general-purpose graphics processing units (GPGPUs), but parallel programming models are lagging behind in effectively exploiting the available application parallelism. There are two principal reasons. First, real-world applications often exhibit a rich composition of nested parallelism, whose static extraction requires a set […]
Mar, 22

PTX2Kernel: Converting PTX Code into Compilable Kernels

GPUs are now widely used as high performance general purpose computing devices. More and more applications have achieved large speedups with one or more GPUs, and the number of GPU programs is growing fast. In certain situations, the high level CUDA C code of kernels is not available, but low level PTX code can be […]
Mar, 22

Speeding Up Computer Vision Applications on Mobile Computing Platforms

Computer vision (CV) is widely expected to be the next "Big Thing" in mobile computing. For example, Google has recently announced their project "Tango", a 5-inch Android phone containing highly customized hardware and software designed to track the full 3-dimensional motion of the device as you hold it while simultaneously creating a map of the […]
Mar, 22

Raising the Bar for Using GPUs in Software Packet Processing

Numerous recent research efforts have explored the use of Graphics Processing Units (GPUs) as accelerators for software-based routing and packet handling applications, typically demonstrating throughput several times higher than using legacy code on the CPU alone. In this paper, we explore a new hypothesis about such designs: For many such applications, the benefits arise less […]
Mar, 22

Evaluating kernels on Xeon Phi to accelerate Gysela application

This work describes the challenges presented by porting parts of the Gysela code to the Intel Xeon Phi coprocessor, as well as techniques used for optimization, vectorization and tuning that can be applied to other applications. We evaluate the performance of some generic micro-benchmark on Phi versus Intel Sandy Bridge. Several interpolation kernels useful for the Gysela […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: one with an nVidia and an AMD graphics processing unit, the other with two AMD graphics processing units. There is no limit on the number of runs.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
