Jul, 22

High Performance Low Power Embedded Vision Systems

Human vision is a complex combination of physical, psychological and neurological processes that allow us to interact with our environment. We use vision effortlessly to detect, identify and track objects, to navigate and to create a conceptual map of our surroundings. The goal of computer vision is to design computer systems that are capable of […]
Jul, 20

Data Parallel Quadtree Indexing and Spatial Query Processing of Complex Polygon Data on GPUs

Fast growing computing power on commodity parallel hardware makes it both an opportunity and a challenge to use modern hardware for large-scale data management. While GPU (Graphics Processing Unit) computing is conceptually an excellent match for spatial data management which is both data and computing intensive, the complexity of multi-dimensional spatial indexing and query processing […]
Jul, 20

An execution model for adaptive load-balancing on multicore and multi-GPU systems

As computing systems increase in size and parallelism, it becomes more and more difficult to balance workload during program execution. Heterogeneous systems further complicate the situation, as their different constituent compute resources may consume work at different rates, and may have affinity for different kinds of work. Traditional approaches to load-balancing fail to fully address […]
Jul, 20

Performance-efficient mechanisms for managing irregularity in throughput processors

Recent graphics processing units (GPUs) have emerged as a promising platform for general purpose computing and have been shown to be very efficient in executing parallel applications with regular control and memory access behavior. Current GPU architectures primarily adopt the single-instruction multiple-thread (SIMT) programming model that balances programmability and hardware efficiency. With SIMT, the programmer […]
Jul, 20

Algorithms for Large-Scale Power Delivery Network Analysis on Massively Parallel Architectures

The on-chip power delivery network constitutes a vital subsystem of modern nanometer-scale integrated circuits, since it critically affects the performance and correct operation of the devices. As technology scaling enters the nanometer regime, there is an increasing need for accurate and efficient analysis of the power delivery network. The impact of […]
Jul, 20

Water simulation for cell based sandbox games

This thesis work presents a new algorithm for simulating fluid based on the Navier-Stokes equations. The algorithm is designed for cell based sandbox games where interactivity and performance are the main priorities. The algorithm enforces mass conservation conservatively instead of enforcing a divergence free velocity field. A global scale pressure model that simulates hydrostatic pressure […]
Jul, 18

Unsupervised Asset Cluster Analysis Implemented with Parallel Genetic Algorithms on the NVIDIA CUDA Platform

During times of stock market turbulence and crises, monitoring the clustering behaviour of financial instruments allows one to better understand the behaviour of the stock market and the associated systemic risks. In the study undertaken, I apply an effective and performant approach to classify data clusters in order to better understand correlations between stocks. The […]
Jul, 18

Implementation & Parallelisation of FDTD code for Electromagnetic Scattering

The present report applies the FDTD method developed by Kane Yee for computing electromagnetic field components to Cartesian meshes using the total-field formulation. For this purpose, code for electromagnetic scattering computation has been written in C. Some snippets from [1] have been used in the code. […]
Jul, 18

GPU acceleration of Runge Kutta-Fehlberg and its comparison with Dormand-Prince method

The emergence of Graphics Processing Units (GPUs) has significantly reduced processing time and improved performance in computer graphics. GPUs have been developed to surpass Central Processing Units (CPUs) in performance and processing speed. This evolution has opened up a new area of computing and research where highly parallel […]
Jul, 18

Efficient On-the-fly Category Retrieval using ConvNets and GPUs

We investigate the gains in precision and speed that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval, where classifiers are learnt at run time for a textual query from downloaded images and used to rank large image or video datasets. We make three contributions: (i) we present an evaluation of state-of-the-art […]
Jul, 18

Suitability of NVIDIA GPUs for SKA1-Low

In this memo we investigate the applicability of NVIDIA Graphics Processing Units (GPUs) for SKA1-Low station and Central Signal Processing (CSP)-level processing. Station-level processing primarily involves generating a single station beam which will then be correlated with other beams in CSP. Fine channelisation can be performed either at the station or CSP level, while coarse channelisation […]
Jul, 17

Efficient implementation of computationally intensive algorithms on parallel computing platforms

Two different types of computationally intensive problems have been researched to investigate the design methodology of the acceleration and to give a high-performance implementation on parallel architectures. Each problem was accelerated via a different architecture, and the results of the investigation were summarized in different thesis groups. The design methodology proposed in Thesis 1 can […]


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: the first has two AMD GPUs, and the second has one AMD and one nVidia GPU. There is no restriction on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors
