Moritz Kreutzer, Jonas Thies, Melven Rohrig-Zollner, Andreas Pieper, Faisal Shahzad, Martin Galgon, Achim Basermann, Holger Fehske, Georg Hager, Gerhard Wellein
While many of the architectural details of future exascale-class high performance computer systems are still a matter of intense research, there appears to be a general consensus that they will be strongly heterogeneous, featuring "standard" as well as "accelerated" resources. Today, such resources are available as multicore processors, graphics processing units (GPUs), and other accelerators […]
Xavier Saez, Alejandro Soba, Edilberto Sanchez, Mervi Mantsinen, Jose M. Cela
PIC methods are among the most widely used methods in plasma simulations. We present a comprehensible evaluation of the PIC code performance on four current parallel platforms: IBM PowerPC, Intel Nehalem (SMP), Intel Sandy Bridge (SMP) and ARM GPU. The behavior of computational algorithms and data structures is analyzed to deduce which code optimizations will […]
M. Pluta, B. Borkowski, I. Czajka, K. Suder-Debska
The paper presents a comparison of central processing unit (CPU) and graphics processing unit (GPU) performance in sound synthesis based on physical modeling. The goal was to achieve real-time performance with two- and three-dimensional finite difference (FD) instrument models. Two abstract instruments, a membrane and a block, were modeled and tested using a CPU and […]
Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation performance of the front-end network using phone-class information. At the front-end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, […]
Tobias Winchen, Marvin Gottowik, Julian Rautenberg
The Pierre Auger Observatory is currently the largest experiment dedicated to unveiling the nature and origin of the highest-energy cosmic rays. The software framework ‘Offline’ has been developed by the Pierre Auger Collaboration for joint analysis of data from different independent detector systems used in one observatory. While reconstruction modules are specific to the […]
David Medina
Rapid evolution of computer processor architectures has spawned multiple programming languages and standards. This thesis strives to address the challenges caused by fast and cyclical changes in programming models. The novel contribution of this thesis is the introduction of an abstract unified framework which addresses portability and performance for programming manycore devices. To test this […]
Luna Backes, Alejandro Rico, Bjorn Franke
Computer vision (CV) is widely expected to be the next big thing in mobile computing. The availability of a camera and a large number of sensors in mobile devices will enable CV applications that understand the environment and enhance people’s lives through augmented reality. One of the problems yet to solve is how to transfer […]
D. Jose Manuel Navarro Jimenez
The Department of Mechanical and Materials Engineering has developed a 2D Finite Element code based on geometry independent Cartesian grids (cgFEM) capable of solving shape optimization problems as well as making patient-specific analyses using medical images. A similar code in 3D (FEAVox) is currently under development. Both codes are implemented in MATLAB, a simple and […]
Ping Liu
XML has been used as a textual data format for transporting and storing information in many areas. However, the cost of processing large-scale XML files becomes a serious issue for general processing methods. In this paper, we propose a design and implementation of a large-scale XML processing system on a GPU cluster to address […]
Thijs van Wingerden
A novel approach is presented to render large voxel scenes in real-time. The approach differs from existing solutions in that a large emphasis is put on allowing the user to edit and stream large datasets. Previous solutions often use compression schemes involving hierarchical data layouts such as sparse voxel octrees that require some form of […]
Yuliang Pu, Jun Peng, Letian Huang, John Chen
Accurate and efficient data classification techniques are of vital importance to many problems and have developed rapidly in recent decades. The K-Nearest Neighbor (KNN) algorithm, one of the most important algorithms, is widely used in text categorization, predictive analysis, data mining, image recognition, etc. To accelerate the algorithm and to optimize the parallel implementation […]
Sander Lijbrink
The Xeon Phi is a coprocessor first released in 2012 by Intel. With x86 instruction set support, 60 cores and up to 2 teraflops of single-precision performance, the Xeon Phi seems promising and has gained wide interest. The world’s fastest supercomputer to date, the Tianhe-2, features the Xeon Phi, as does the recently announced 180 […]

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. Each run is given 1 minute of compute time on two nodes: one with an nVidia and an AMD graphics processing unit, the other with two AMD graphics processing units. There is no limit on the number of runs.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
