Usman Dastgeer, Christoph Kessler
An important program optimization, especially for heterogeneous parallel systems, is performance-aware implementation selection: the (static or dynamic) selection between multiple implementation variants for the same computation, depending on the current execution context (such as currently available resources or performance-affecting parameter values). Doing it for multiple component calls inside a program while considering interferences […]
Pu Yang
Two recent methods for the fast simulation of building airflow are studied: the fast fluid dynamics (FFD) algorithm and the use of the graphics processing unit (GPU) for scientific computing in building engineering. A Google SketchUp plug-in for the FFD program was also developed as a model-creating tool to enhance the accessibility of the operation […]
Christian Azambuja Pagot
Computational fluid dynamics (CFD) methods have been employed in the study of subjects such as aeroacoustics, gas dynamics, turbomachinery, and viscoelastic fluids, among others. However, the need for accuracy and high performance has resulted in methods whose solutions are becoming increasingly complex. In this context, feature extraction and visualization methods play a key role, making […]
Chung-Ching Shen, Hsiang-Huang Wu, Nimish Sane, William Plishker, Shuvra S. Bhattacharyya
Development of multimedia systems on heterogeneous platforms is a challenging task with existing design tools due to a lack of rigorous integration between high-level abstract modeling and low-level synthesis and analysis. In this paper, we present a new dataflow-based design tool, called the targeted dataflow interchange format (TDIF), for design, analysis, and implementation […]
Shuvra Bhattacharyya, Chung-Ching Shen, William Plishker, Nimish Sane, Hsiang-Huang Wu, Ruirui Gu
This report describes a new dataflow-based technology and associated design tools for high-productivity design, analysis, and optimization of layered sensing software for signal processing systems. Our approach provides novel capabilities, based on the principles of task-level dataflow analysis, for exploring and optimizing interactions across application behavior; operational context; high performance embedded processing platforms, and implementation […]
Ashwin Prasad, Jayvant Anantpur, R. Govindarajan
MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control flow dominated scalar regions that have an impact on the program’s execution time. Today’s computer systems have tremendous […]
Daniel Hackenberg, Guido Juckeland, Holger Brunst
The advent of multi-core processors has made parallel computing techniques mandatory on mainstream systems. With the recent rise of hardware accelerators, hybrid parallelism adds yet another dimension of complexity to the process of software development. This article presents a tool for graphical program flow analysis of hardware-accelerated parallel programs. It monitors the hybrid […]
Tarun Prabhu, Shreyas Ramalingam, Matthew Might, Mary Hall
We describe, implement and benchmark EigenCFA, an algorithm for accelerating higher-order control-flow analysis (specifically, 0CFA) with a GPU. Ultimately, our program transformations, reductions and optimizations achieve a factor of 72 speedup over an optimized CPU implementation. We began our investigation with the view that GPUs accelerate high-arithmetic, data-parallel computations with a poor tolerance for branching. […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no restriction on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). Terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
