Richard Membarth, Frank Hannig, Jurgen Teich, Mario Korner, Wieland Eckert
To cope with the complexity of programming GPU accelerators for medical imaging computations, we developed a framework to describe image processing kernels in a domain-specific language, which is embedded into C++. The description uses decoupled access/execute metadata, which allow the programmer to specify both execution constraints and memory access patterns of kernels. A source-to-source compiler […]
Richard Membarth, Frank Hannig, Jurgen Teich, Mario Korner, Wieland Eckert
The development of standard processors has changed in recent years, moving from bigger, more complex, and faster cores to putting several simpler cores onto one chip. This has also changed the way programs are written in order to leverage the processing power of multiple cores of the same processor. In the beginning, programmers had to […]
Richard Membarth, Frank Hannig, Jurgen Teich, Mario Korner, Wieland Eckert
In the last decade, there has been dramatic growth in research and development of massively parallel many-core architectures like graphics hardware, both in academia and industry. This has also changed the way programs are written in order to leverage the processing power of a multitude of cores on the same hardware. In the beginning, programmers […]
Christopher G. Baker, Michael A. Heroux, H. Carter Edwards, Alan B. Williams
Multicore nodes have become ubiquitous in just a few years. At the same time, writing portable parallel software for multicore nodes is extremely challenging. Widely available programming models such as OpenMP and Pthreads are not useful for devices such as graphics cards, and more flexible programming models such as RapidMind are only available commercially. OpenCL […]
Filip Vesely
The performance and the level of programmability of graphics processors (GPU) on current video cards offer new capabilities beyond the graphics applications for which they were designed. These are general-purpose computations which expose parallelism. In this thesis, I describe the iterative methods for solving sparse linear systems: the Jacobi, Gauss-Seidel, Conjugate Gradient and BiConjugate Gradient […]
W. B. Langdon
Limited numerical precision of nVidia GeForce 8800 GTX and other GPUs requires careful implementation of PRNGs. The Park-Miller PRNG is programmed using G80’s native Value4f floating point in RapidMind C++. Speed up is more than 40. Code is available via FTP at ftp://cs.ucl.ac.uk/genetic/gp-code/random-numbers/gpu_park-miller.tar.gz
W. B. Langdon
Limited numerical precision of nVidia GeForce 8800 GTX and other GPUs requires careful implementation of PRNGs. The Park-Miller PRNG is programmed using G80’s native Value4f floating point in RapidMind C++. Speed up is more than 40. Code is available via ftp cs.ucl.ac.uk genetic/gp-code/random-numbers/gpu_park-miller.tar.gz.
Pedro Trancoso, Despo Othonos, Artemakis Artemiou
Decision Support System (DSS) workloads are known to be among the most time-consuming database workloads that process large data sets. Traditionally, DSS queries have been accelerated using large-scale multiprocessors. The topic addressed in this work is to analyze the benefits of using high-performance/low-cost processors such as the GPUs and the Cell/BE to accelerate DSS […]
Virat Agarwal, Lurng-Kuo Liu, David A. Bader
High performance computing is critical for financial markets where analysts seek to accelerate complex optimizations such as pricing engines to maintain a competitive edge. In this paper we investigate the performance of financial workloads on the Sony-Toshiba-IBM Cell Broadband Engine, a heterogeneous multicore chip architected for intensive gaming applications and high performance computing. We […]
Justin Wilson, Manhong Dai, Elvis Jakupovic, Stanley Watson, Fan Meng
Modern video cards and game consoles typically have much better performance to price ratios than that of general purpose CPUs. The parallel processing capabilities of game hardware are well-suited for high throughput biomedical data analysis. Our initial results suggest that game hardware is a cost-effective platform for some computationally demanding bioinformatics problems.
William B. Langdon, Andrew P. Harrison
We demonstrate a SIMD C++ genetic programming system on a single 128 node parallel nVidia GeForce 8800 GTX GPU under RapidMind’s GPGPU Linux software by predicting ten year+ outcome of breast cancer from a dataset containing a million inputs. NCBI GEO GSE3494 contains hundreds of Affymetrix HG-U133A and HG-U133B GeneChip biopsies. Multiple GP runs each […]
William B. Langdon
A GPU is used to datamine five million correlations between probes within Affymetrix HG-U133A probesets across 6685 human tissue samples from NCBI's GEO database. These concordances are used as machine learning training data for genetic programming running on a Linux PC with a RapidMind OpenGL GLSL backend. GPGPU is used to identify technological factors influencing […]

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no restriction on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors