Nov, 12

Comparison of parallel sorting algorithms

In our study we implemented and compared seven sequential and parallel sorting algorithms: bitonic sort, multistep bitonic sort, adaptive bitonic sort, merge sort, quicksort, radix sort and sample sort. Sequential algorithms were implemented on a central processing unit using C++, whereas parallel algorithms were implemented on a graphics processing unit using the CUDA platform. We chose […]
Nov, 11

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

TensorFlow [1] is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines […]
Nov, 11

Autotuning OpenCL Workgroup Size for Stencil Patterns

Selecting an appropriate workgroup size is critical for the performance of OpenCL kernels, and requires knowledge of the underlying hardware, the data being operated on, and the implementation of the kernel. This makes portable performance of OpenCL programs a challenging goal, since simple heuristics and statically chosen values fail to exploit the available performance. To […]
Nov, 11

Climbing Mont Blanc – A Training Site for Energy Efficient Programming on Heterogeneous Multicore Processors

Climbing Mont Blanc (CMB) is an open online judge used for training in energy efficient programming of state-of-the-art heterogeneous multicores. It uses an Odroid-XU3 board from Hardkernel with an Exynos Octa processor and integrated power sensors. This processor is three-way heterogeneous containing 14 different cores of three different types. The board currently accepts C and […]
Nov, 11

Integrating a large-scale testing campaign in the CK framework

We consider the problem of conducting large experimental campaigns in computer science research. Most research efforts require a certain level of bookkeeping of results. This is manageable via quick, on-the-fly infrastructure implementations. However, it becomes a problem for large-scale testing initiatives, especially as the needs of the project evolve along the way. We look at […]
Nov, 11

Evaluating 3-D Stencil codes on Intel Xeon Phi: Limitations and Trade-offs

Accelerators like Intel Xeon Phi aim to fulfill the computational requirements of modern applications. Of particular interest to us are applications based on stencil computations. Stencils are finite-difference algorithms used in many scientific and engineering applications for solving large-scale and high-dimension partial differential equations. Programmability on massively parallel architectures of such kernels […]
Nov, 11

5th International Conference on Software and Computer Applications (ICSCA), 2016

Notification of Acceptance: Before February 20, 2016. The conference committees consist of professors, specialists and distinguished researchers from the UK, Japan, Hong Kong, Singapore, Brunei Darussalam and other places. ICSCA 2016 is supported by University of Brunei Darussalam, Hong Kong University of Science and Technology, University of Bristol, Nanyang Technological University and University of Tokyo. […]
Nov, 11

International Conference on Computer Communication and Management (ICCCM), 2016

The Strong Committee Team: Prof. Jalel Ben-Othman, University of Paris 13, France; Prof. Alexander Balinsky, Cardiff University, United Kingdom; Dr. Krzysztof Koszela, Poznan University of Life Sciences, Poland. Agenda: June 10, 2016 – Registration & Conference Materials Collection; June 11, 2016 – Keynote Speeches & Participants’ Oral Presentation; June 12, 2016 – Visit Real Peer Review […]
Nov, 11

The First IEEE International Conference on Computer Communication and the Internet (ICCCI), 2016

★ICCCI 2016 conference proceedings will be published by IEEE Conference Publication, which would be indexed by . ★NEWS: ICCCI 2016 Conference has been listed in IEEE! Online: http://www.ieee.org/conferences_events/conferences/conferencedetails/index.html?Conf_ID=37952 ★Keynote & Plenary Speakers: Prof. Steven Low, IEEE Fellow, ACM Fellow, Caltech, USA; Prof. Moshe Zukerman, IEEE Fellow, City University of Hong Kong; Prof. Rod Kennedy, IEEE Fellow, […]
Nov, 11

5th International Conference on Computer Technology and Science (ICCTS), 2016

Submission Date: Before February 5 Mainly Supported by: University of Brunei Darussalam, Brunei Darussalam. Keynote Speakers: Prof. Amine Bermak, Hong Kong University of Science and Technology, Hong Kong Prof. Dhiraj K. Pradhan, University of Bristol, UK Prof. Kot Chichung, Alex, Nanyang Technological University, Singapore Prof. Kiyoharu Aizawa, University of Tokyo, Japan Prof. Liyanage C De […]
Nov, 10

Parallelizing the Edge application for GPU-based systems using the SkePU skeleton programming library

SkePU is an auto-tunable multi-backend skeleton programming library for multi-GPU systems. SkePU is implemented as a C++ template library and has been developed at Linkoping University. In this thesis the CFD flow solver Edge has been ported to SkePU. This combines the paradigm of skeleton programming with the utilization of the unstructured grid structure used […]
Nov, 10

Evaluation of the Intel Xeon Phi and NVIDIA K80 as accelerators for two-dimensional panel codes

To predict the properties of fluid flow over a solid geometry is an important engineering problem. In many applications so-called panel methods (or boundary element methods) have become the standard approach to solve the corresponding partial differential equation. Since panel methods in two dimensions are computationally cheap, they are well suited as the inner solver […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: one with an nVidia and an AMD graphics processing unit, the other with two AMD graphics processing units. There is no limit on the number of runs.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). Terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
