Fadel Adoe, Yingke Chen, Prashant Doshi
Planning under uncertainty in multiagent settings is highly intractable because of history and plan space complexities. Probabilistic graphical models exploit the structure of the problem domain to mitigate the computational burden. In this paper, we introduce the first parallelization of planning in multiagent settings on a CPU-GPU heterogeneous system. In particular, we focus on the […]
W. P. Gaudin, A. C. Mallinson, O. Perks, J. A. Herdman, D. A. Beckingsale, J. M. Levesque, M. Boulton, S. McIntosh-Smith, S. A. Jarvis
Power constraints are forcing HPC systems to continue to increase hardware concurrency. Efficiently scaling applications on future machines will be essential for improved science and it is recognised that the "flat" MPI model will start to reach its scalability limits. The optimal approach is unknown, necessitating the use of mini-applications to rapidly evaluate new approaches. […]
Andrew L Beam, Alison Motsinger-Reif, Jon Doyle
BACKGROUND: Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions. RESULTS: A non-parametric Bayesian approach in the form of a Bayesian neural network is proposed for use in analyzing […]
Marvin Damschen, Christian Plessl
This paper introduces Binary Acceleration At Runtime (BAAR), an easy-to-use on-the-fly binary acceleration mechanism which aims to tackle the problem of enabling existing software to automatically utilize accelerators at runtime. BAAR is based on the LLVM Compiler Infrastructure and has a client-server architecture. The client runs the program to be accelerated in an environment which […]
Nhat Tan Nguyen Thanh
Communication remains a significant barrier to scalability on distributed-memory systems. At present, the trend in architectural system design, which focuses on enhancing node performance, exacerbates the communication problem, since the relative cost of communication grows as the computation rate increases. This problem will be more pronounced at the exascale, where computational rates will be orders […]
Hannes Vogt, Mario Schrock
We adopt CUDA-capable Graphics Processing Units (GPUs) for Landau, Coulomb and maximally Abelian gauge fixing in 3+1 dimensional SU(3) and SU(2) lattice gauge field theories. A combination of simulated annealing and overrelaxation is used to aim for the global maximum of the gauge functional. We use a fine-grained degree of parallelism to achieve the […]
Weiguang Ding, Ruoyan Wang, Fei Mao, Graham Taylor
In this report, we describe a Theano-based AlexNet (Krizhevsky et al., 2012) implementation and its naive data parallelism on multiple GPUs. Our performance on 2 GPUs is comparable with the state-of-the-art Caffe library (Jia et al., 2014) run on 1 GPU. To the best of our knowledge, this is the first open-source Python-based AlexNet implementation […]
Edward Meeds, Remco Hendriks, Said al Faraby, Magiel Bruntink, Max Welling
With few exceptions, the field of Machine Learning (ML) research has largely ignored the browser as a computational engine. Beyond an educational resource for ML, the browser has vast potential to not only improve the state-of-the-art in ML research, but also, inexpensively and on a massive scale, to bring sophisticated ML learning and prediction to […]
Sagar Venkatesh Gubbi, Chandra Sekhar Seelamantula
Image denoising is a classical problem in image processing and has applications in areas ranging from photography to medical imaging. In this paper, we examine the denoising performance of an optimized spatially-varying Gaussian filter. The parameters of the Gaussian filter are tuned by optimizing a mean squared error estimate which is similar to Stein’s Unbiased Risk […]
Izumi Mizuno, Seiji Kameno, Amane Kano, Makoto Kuroo, Fumitaka Nakamura, Noriyuki Kawaguchi, Katsunori M. Shibata, Seisuke Kuji, Nario Kuno
We have developed a software-based polarization spectrometer, PolariS, to acquire full-Stokes spectra with a very high spectral resolution of 61 Hz. The primary aim of PolariS is to measure the magnetic fields in dense star-forming cores by detecting the Zeeman splitting of molecular emission lines. The spectrometer consists of a commercially available digital sampler and […]
Mahdi S. Mohammadi, Mehdi Rezaeian
Scale Invariant Feature Transform (SIFT) is a popular image feature extraction algorithm. SIFT’s features are invariant to many image-related variables, including scale and change in viewpoint. Despite its broad capabilities, it is computationally expensive. This makes it hard for researchers to use SIFT in their work, especially in real-time applications. This is […]
Nam-Luc Tran, Sabri Skhiri, Arnaud Schils, Edgar Isaac Hiroshi Leon Saiki
Over the past years there has been significant enthusiasm for the development of parallel computing on Graphics Processing Units (GPUs), which have now become powerful and affordable hardware equipping data centers and research clusters. Our earlier research has explored ways to exploit the parallel compute performance of the GPU alongside the CPU in the same […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: Node 1 has two AMD GPUs, and Node 2 has one AMD and one nVidia GPU. There is no limit on the number of runs.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

Information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors
