Jie Zhu, Hai Jiang, Juanjuan Li, Erikson Hardesty, Kuan-Ching Li, Zhongwen Li
As the size of high performance applications increases, four major challenges have arisen in the underlying distributed systems: heterogeneity, programmability, fault resilience, and energy efficiency. To tackle all of them without sacrificing performance, traditional approaches to resource utilization, task scheduling and programming paradigms should be reconsidered. While Hadoop has handled data-intensive applications well […]
Cedric Nugteren, Gert-Jan van den Braak, Henk Corporaal
Programming models such as CUDA and OpenCL allow the programmer to specify the independence of threads, effectively removing ordering constraints. Still, parallel architectures such as the graphics processing unit (GPU) do not exploit the potential of data-locality enabled by this independence. Therefore, programmers are required to manually perform data-locality optimisations such as memory coalescing or […]
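To illustrate the data-locality issue mentioned above, the following OpenCL sketch contrasts a coalesced access pattern with a strided one; the kernel names and parameters are hypothetical and not taken from the paper.

    // Coalesced: consecutive work-items read consecutive addresses,
    // so the hardware can merge them into a few wide memory transactions.
    __kernel void copy_coalesced(__global const float *in,
                                 __global float *out,
                                 const int n)
    {
        int gid = (int)get_global_id(0);
        if (gid < n)
            out[gid] = in[gid];
    }

    // Strided (uncoalesced): each work-item touches an address far from its
    // neighbours, so the same copy needs many more memory transactions.
    __kernel void copy_strided(__global const float *in,
                               __global float *out,
                               const int n,
                               const int stride)
    {
        int gid = (int)get_global_id(0);
        if (gid * stride < n)
            out[gid] = in[gid * stride];
    }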
Yuan Wen, Zheng Wang, Michael F.P. O'Boyle
Heterogeneous systems consisting of multiple CPUs and GPUs are increasingly attractive as platforms for high performance computing. Such platforms are usually programmed using OpenCL which provides program portability by allowing the same program to execute on different types of device. As such systems become more mainstream, they will move from application dedicated devices to platforms […]
Fumihiko Ino, Yosuke Oka, Kenichi Hagihara
The emergence of compute unified device architecture (CUDA), which has relieved application developers from having to understand complex graphics pipelines, has made the graphics processing unit (GPU) useful not only for graphics applications but also for general applications. In this paper, we present a cycle sharing system named GPU grid, which exploits idle GPU cycles […]
Pedro Alonso, Manuel F. Dolz, Francisco D. Igual, Rafael Mayo, Enrique S. Quintana-Orti
The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate parallel executions with minimal developer intervention has been introduced in recent years to tackle the programmability issue while maintaining, or even improving, performance. In this paper, […]
Li Zhen, Qiuxiao Gang, Guo Gang, Chen Bin
The graphics processing unit (GPU) provides strong computing capability at relatively low energy and monetary cost, and it has been widely used in the field of large-scale simulation and computation. Among such approaches, the CPU-GPU heterogeneous collaborative computing model has become an effective way to address the simulation performance of large-scale artificial society. But there are lots […]
Glenn A. Elliott, James H. Anderson
Motivated by computational capacity and power efficiency, techniques for integrating graphics processing units (GPUs) into real-time systems have become an active area of research. While much of this work has focused on single-GPU systems, multiple GPUs may be used for further benefits. Similar to CPUs in multiprocessor systems, GPUs in multi-GPU systems may be managed […]
Li Tian, Fugen Zhou, Cai Meng
We address the problem that multicore DSP systems do not support OpenCL programming. We designed a compiler and proposed a runtime framework for TI multicore DSPs, with which OpenCL parallel programs can take advantage of the multicore computing resources. First, we use the LLVM and Clang compiler front end to achieve source-to-source translation, and in the next […]
Robert F. Lyerly
The world of high-performance computing has shifted from increasing single-core performance to extracting performance from heterogeneous multi- and many-core processors due to the power, memory and instruction-level parallelism walls. All trends point towards increased processor heterogeneity as a means for increasing application performance, from smartphones to servers. These various architectures are designed for different types […]
Martin Tillenius
In computational science, making efficient use of modern multicore based computer hardware is necessary in order to deal with complex real-life application problems. However, with increased hardware complexity, the cost in man hours of writing and re-writing software to adapt to evolving computer systems is becoming prohibitive. Task based parallel programming models aim to allow […]
Jinglian Wang, Bin Gong, Hong Liu, Shaohui Li, Juan Yi
This work presents a novel parallel evolutionary algorithm (EA) for task scheduling in distributed heterogeneous computing and grid environments, NP-hard problems of major relevance in distributed computing. The parallelization of the biologically inspired heuristic is hierarchically designed and integrates the two traditional parallel models (master-slave and island models). The method has been specifically implemented […]
Nguyen Quang-Hung, Le Thanh Tan, Chiem Thach Phat, Nam Thoai
In this paper, we consider power-aware task scheduling (PATS) in HPC clouds. Users request virtual machines (VMs) to execute their tasks. Each task is executed on one single VM, and requires a fixed number of cores (i.e., processors), computing power (million instructions per second – MIPS) of each core, a fixed start time and non-preemption […]
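As a rough illustration of the task model sketched above (fixed core count, per-core MIPS, fixed start time, non-preemptive execution), here is a minimal C sketch; all field names and values are assumptions for illustration, not taken from the paper.

    #include <stdio.h>

    typedef struct {
        int    cores;         /* fixed number of cores (processors) requested */
        double mips_per_core; /* required computing power per core, in MIPS   */
        double start_time;    /* fixed start time, in seconds                 */
        double duration;      /* runs to completion (non-preemptive)          */
    } Task;

    int main(void)
    {
        /* One VM request: 4 cores at 2000 MIPS each, starting at t = 10 s. */
        Task t = { .cores = 4, .mips_per_core = 2000.0,
                   .start_time = 10.0, .duration = 3600.0 };
        printf("task needs %d cores x %.0f MIPS from t=%.0f for %.0f s\n",
               t.cores, t.mips_per_core, t.start_time, t.duration);
        return 0;
    }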

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per run on two nodes equipped with two AMD and one nVidia graphics processing units, respectively. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the compilation and execution terminal output logs will be provided to the user.
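
Before uploading a full project, a minimal host program such as the sketch below (not an official hgpu.org example) can be used to check which OpenCL platforms and devices a node exposes; it relies only on standard OpenCL 1.x API calls and would typically be built with something like gcc list_devices.c -lOpenCL.

    #include <stdio.h>
    #include <CL/cl.h>

    int main(void)
    {
        cl_platform_id platforms[8];
        cl_uint num_platforms = 0;
        clGetPlatformIDs(8, platforms, &num_platforms);

        for (cl_uint p = 0; p < num_platforms; ++p) {
            char pname[256];
            clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME,
                              sizeof(pname), pname, NULL);
            printf("Platform %u: %s\n", p, pname);

            cl_device_id devices[8];
            cl_uint num_devices = 0;
            clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL,
                           8, devices, &num_devices);
            for (cl_uint d = 0; d < num_devices; ++d) {
                char dname[256];
                clGetDeviceInfo(devices[d], CL_DEVICE_NAME,
                                sizeof(dname), dname, NULL);
                printf("  Device %u: %s\n", d, dname);
            }
        }
        return 0;
    }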

The information sent to hgpu.org will be treated according to our Privacy Policy.
