12575
Ke Ding, Ying Tan
Benchmarking is key for developing and comparing optimization algorithms. In this paper, a CUDA-based real parameter optimization benchmark (cuROB) is introduced. Test functions of diverse properties are included within cuROB and implemented efficiently with CUDA. Speedup of one order of magnitude can be achieved in comparison with CPU-based benchmark of CEC’14.
Yan Zhou
The sequential Monte Carlo (smc) methods have been widely used for modern scientific computation. Bayesian model comparison has been successfully applied in many fields. Yet there have been few researches on the use of smc for the purpose of Bayesian model comparison. This thesis studies different smc strategies for Bayesian model computation. In addition, various […]
Esin Yavuz, James Turner, Thomas Nowotny
A major challenge in computational neuroscience is to achieve high performance for real-time simulations of full size brain networks. Recent advances in GPU technology provide massively parallel, low-cost and efficient hardware that is widely available on the computer market. However, the comparatively low-level programming that is necessary to create an efficient GPU-compatible implementation of neuronal […]
Ezequiel E. Ferrero, Alejandro B. Kolton, Matteo Palassini
We develop a parallel rejection algorithm to tackle the problem of low acceptance in Monte Carlo methods, and apply it to the simulation of the hopping conduction in Coulomb glasses using Graphics Processing Units, for which we also parallelize the update of local energies. In two dimensions, our parallel code achieves speedups of up to […]
Samuel W. Skillman, Michael S. Warren, Matthew J. Turk, Risa H. Wechsler, Daniel E. Holz, P. M. Sutter
The Dark Sky Simulations are an ongoing series of cosmological N-body simulations designed to provide a quantitative and accessible model of the evolution of the large-scale Universe. Such models are essential for many aspects of the study of dark matter and dark energy, since we lack a sufficiently accurate analytic model of non-linear gravitational clustering. […]
Christopher Schuster
SAT solving strategies that perform backtracking or clause learning are usually difficult to implement efficiently on massively-parallel architectures because the necessary synchronization does not scale linear with the number of processors available. Strategies like Lookahead Solving and Cube and Conquer are more promising. In order to evaluate a potential GPU implementation of Cube and Conquer, […]
Alessandro Dal Pal'u, Agostino Dovier, Andrea Formisano, Enrico Pontelli
The parallel computing power offered by Graphical Processing Units (GPUs) has been recently exploited to support general purpose applications-by exploiting the availability of general API and the SIMT-style parallelism present in several classes of problems (e.g., numerical simulations, matrix manipulations) – where relatively simple computations need to be applied to all items in large sets […]
Xinxin Mei, Kaiyong Zhao, Chengjian Liu, Xiaowen Chu
Memory access efficiency is a key factor for fully exploiting the computational power of Graphics Processing Units (GPUs). However, many details of the GPU memory hierarchy are not released by the vendors. We propose a novel fine-grained benchmarking approach and apply it on two popular GPUs, namely Fermi and Kepler, to expose the previously unknown […]
Aditya Deshpande
In earlier times, computer systems had only a single core or processor. In these computers, the number of transistors on-chip (i.e. on the processor) doubled every two years and all applications enjoyed free speedup. Subsequently, with more and more transistors being packed on-chip, power consumption became an issue, frequency scaling reached its limits and industry […]
Charalampos S. Kouzinopoulos, John-Alexander M. Assael, Themistoklis K. Pyrgiotis, Konstantinos G. Margaritis
Multiple matching algorithms are used to locate the occurrences of patterns from a finite pattern set in a large input string. Aho-Corasick and Wu-Manber, two of the most well known algorithms for multiple matching require an increased computing power, particularly in cases where large-size datasets must be processed, as is common in computational biology applications. […]
Hasse Lovgren
CONTEXT: Reinforcement Learning (RL) is a time consuming effort that requires a lot of computational power as well. There are mainly two approaches to improving RL efficiency, the theoretical mathematics and algorithmic approach or the practical implementation approach. In this study, the approaches are combined in an attempt to reduce time consumption. OBJECTIVES: We investigate […]
Bartosz D. Wozniak
Matrix multiplication is a fundamental linear algebra routine ubiquitous in all areas of science and engineering. Highly optimised BLAS libraries (cuBLAS and clBLAS on GPUs) are the most popular choices for an implementation of the General Matrix Multiply (GEMM) in software. However, performance of library GEMM is poor for small matrix sizes. In this thesis […]
Page 1 of 7412345...102030...Last »

* * *

* * *

Like us on Facebook

HGPU group

128 people like HGPU on Facebook

Follow us on Twitter

HGPU group

1194 peoples are following HGPU @twitter

Featured events

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors

Contact us: