Anastasia Kruchinina
Parallel computing is a topic that became very popular in the last few decades. Parallel computers are being used in many different areas of science such as astrophysics, climate modelling, quantum chemistry, fluid dynamics and medicine. Parallel programming is a type of programming where computations can be performed concurrently on different processors or devices. There […]
Yun Fei, Wenping Wang, Bin Wang
Nonlinear optimization is at the heart of many algorithms in engineering. Recently, due to the rise of general purpose graphics processing unit (GPGPU), it is promising to investigate the performance improvement of optimization methods after parallelized. While much has been done for simple optimization methods such as conjugate gradient, due to the strong dependencies contained, […]
Sarod Yatawatta, Sanaz Kazemi, Saleem Zaroubi
We present the GPU based acceleration of two well known nonlinear optimization routines: Levenberg-Marquardt (LM) and Limited Memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) in radio interferometric calibration. Radio interferometric calibration is a heavily compute intensive operation where the same nonlinear optimization problem has to be solved over many time intervals, with different data. We achieve a speedup of […]
Stefanie Hahmann, Georges-Pierre Bonneau, Sebastien Barbier, Gershon Elber, Hans Hagen
Free-Form Deformation (FFD) is a well established technique for deforming arbitrary object shapes in space. Although more recent deformation techniques have been introduced, among them skeleton-based deformation and cage-based deformation, the simple and versatile nature of FFD is a strong advantage, and justifies its presence in nowadays leading commercial geometric modeling and animation software systems. […]
Roberto Cavicchioli, Andrea Prearo, Riccardo Zanella, Gaetano Zanghirati, Luca Zanni
This paper explores effective algorithms for the solution of numerical nonlinear optimization problems in image restoration. The technology of modern acquisition techniques and devices most often returns data of increasing size, so we focus on the Scaled Gradient Projection algorithm, which is well suited to large-scale applications. We present its parallel implementations on different hardware, […]
Xueqian Zhao, Yonghe Guo, Zhuo Feng, Shiyan Hu
Decoupling capacitor (decap) placement has been widely adopted as an effective way to suppress dynamic power supply noise. Traditional decap budgeting algorithms usually explore the sensitivity-based nonlinear optimizations or conjugate gradient methods, which can be prohibitively expensive for large-scale decap budgeting problems. We present a hierarchical cross entropy (CE) optimization technique for solving the decap […]
Zhiyu Zeng, Xiaoji Ye, Zhuo Feng, Peng Li
Integrating a large number of on-chip voltage regulators holds the promise of solving many power delivery challenges through strong local load regulation and facilitates system-level power management. The quantitative understanding of such complex power delivery networks (PDNs) is hampered by the large network complexity and interactions between passive on-die/package-level circuits and a multitude of nonlinear […]
Marcus Magnor, Gordon Kindlmann, Charles Hansen, Neb Duric
From our terrestrially confined viewpoint, the actual three-dimensional shape of distant astronomical objects is, in general, very challenging to determine. For one class of astronomical objects, however, spatial structure can be recovered from conventional 2D images alone. So-called planetary nebulae (PNe) exhibit pronounced symmetry characteristics that come about due to fundamental physical processes. Making use […]
M. Magnor, G. Kindlmann, N. Duric, C. Hansen
Determining the three-dimensional structure of distant astronomical objects is a challenging task, given that terrestrial observations provide only one viewpoint. For this task, bipolar planetary nebulae are interesting objects of study because of their pronounced axial symmetry due to fundamental physical processes. Making use of this symmetry constraint, we present a technique to automatically recover […]
Weihang Zhu, J. Curry
This paper presents a massively parallel ant colony optimization – pattern search (ACO-PS) algorithm with graphics hardware acceleration on nonlinear function optimization problems. The objective of this study is to determine the effectiveness of using graphics processing units (GPU) as a hardware platform for ACO-PS. GPU, the common graphics hardware found in modern personal computers, […]
O. Schenk, M. Christen, H. Burkhart
We report on our experience with integrating and using graphics processing units (GPUs) as fast parallel floating-point co-processors to accelerate two fundamental computational scientific kernels on the GPU: sparse direct factorization and nonlinear interior-point optimization. Since a full re-implementation of these complex kernels is typically not feasible, we identify the matrix-matrix multiplication as a first […]

* * *


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
