Shuoxin Lin
Dataflow models are valuable tools for representing, analyzing, and synthesizing embedded systems. Heterogeneous computing platforms with multi-core CPUs and Graphics Processing Units (GPUs) provide a low-cost platform for high-performance computation. In this report, we present a dataflow-based automated design framework that incorporates analysis, optimization, and synthesis tools for embedded systems. Our framework […]
Avtech Scientific
Advanced Simulation Library (ASL) is a free and open source multiphysics simulation software package and a tool for solving Partial Differential Equations. It has a significant user base across many areas of engineering and science, in both industrial and academic organizations. ASL utilizes only methods that allow efficient parallelization: Lattice Boltzmann Methods, Explicit Finite Difference, Matrix […]
Devadas Varma, Tom Feist
Data centers today are the backbone of the modern economy, from the server rooms that power small to midsize organizations to the enterprise data centers that support U.S. corporations and provide access to cloud computing services. According to the Natural Resources Defense Council, data centers are one of the largest and fastest-growing consumers of electricity […]
Lerato J. Mohapi, Simon Winberg, Michael Inggs
In this paper, we present a domain-specific language, referred to as OptiSDR, that matches high-level digital signal processing (DSP) routines for software defined radio (SDR) to their generic parallel executable patterns targeted to heterogeneous computing architectures (HCAs). These HCAs include a combination of hybrid GPU-CPU and DSP-FPGA architectures that are programmed using different programming […]
Farouk Mansouri, Sylvain Huet, Dominique Houzet
Biomedical imaging, digital communications, acoustic signal processing, and many other digital signal processing (DSP) applications are increasingly present in the digital world. They process growing volumes of data, represented with ever greater accuracy, using complex algorithms that must satisfy time constraints. Consequently, a high requirement of […]
Sparsh Mittal, Saket Gupta, Sudeb Dasgupta
Digital image processing (DIP) is an ever-growing area with a variety of applications including medicine, video surveillance, and many more. To implement the upcoming sophisticated DIP algorithms and to process the large amount of data captured from sources such as satellites or medical instruments, intelligent high-speed real-time systems have become imperative. Image processing algorithms […]
Lai-Huei Wang
Dataflow models are widely used for expressing the functionality of digital signal processing (DSP) applications due to their useful features, such as providing formal mechanisms for description of application functionality, imposing minimal data-dependency constraints in specifications, and exposing task and data level parallelism effectively. Due to the increased complexity of dynamics in modern DSP applications, […]
Li Tian, Fugen Zhou, Cai Meng
We address the problem that multicore DSP systems do not support OpenCL programming. We designed a compiler and proposed a runtime framework for TI multicore DSPs, by which OpenCL parallel programs can take advantage of multicore computing resources. First, we use the LLVM and Clang compiler front-end to achieve source-to-source translation and in the next […]
Li Tian, Cai Meng, Fugen Zhou
This paper addresses the problem that multi-DSP systems do not support OpenCL programming. With the proposed compiler, runtime, and kernel scheduler, an OpenCL application becomes portable not only across multiple CPUs and GPUs, but also across embedded multi-DSP systems. First, the LLVM compiler was imported for source-to-source translation in which the translated source […]
Farouk Mansouri, Sylvain Huet, Vincent Fristot, Dominique Houzet
Computer applications are becoming more demanding while also requiring real-time results. Heterogeneous clusters, with their computing power, are a good solution to this need. However, during execution a computing element of the cluster may fail or need maintenance, or the load may need to be rebalanced. In […]
Xin Zhou, Norihiro Tomagou, Yasuaki Ito, Koji Nakano
Embedded multicore processors represented by FPGAs and GPUs have lately attracted considerable attention for their potential computation ability and power consumption. Recent FPGAs have hundreds of embedded DSP slices and block RAMs. For example, Xilinx Virtex-6 Family FPGAs have a DSP48E1 slice, which is a configurable logic block equipped with fast multipliers, adders, pipeline registers, […]
Zheng Zhou
A variety of hardware platforms for signal processing has emerged, from distributed systems such as Wireless Sensor Networks (WSNs) to parallel systems such as Multicore Programmable Digital Signal Processors (PDSPs), Multicore General Purpose Processors (GPPs), and Graphics Processing Units (GPUs) to heterogeneous combinations of parallel and distributed devices. When a signal processing application is implemented […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the compilation and execution terminal output logs will be provided to the user.

Information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
