Matthaus Wander, Lorenz Schwittmann, Christopher Boelmann, Torben Weis
When a client queries for a non-existent name in the Domain Name System (DNS), the server responds with a negative answer. With the DNS Security Extensions (DNSSEC), the server can either use NSEC or NSEC3 for authenticated negative answers. NSEC3 claims to protect DNSSEC servers against domain enumeration, but incurs significant CPU and bandwidth overhead. […]
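Much of the CPU overhead the abstract mentions comes from the way NSEC3 hashes owner names: an iterated, salted SHA-1 (RFC 5155). The following C++ sketch, which assumes OpenSSL and skips the canonical DNS wire-format encoding of the name, shows that per-name cost (the salt and iteration count below are made-up example values, not defaults):

```cpp
// Sketch of the NSEC3 owner-name hash (RFC 5155): SHA-1 iterated with a salt.
// Simplified: a real implementation hashes the canonical wire-format name.
#include <openssl/sha.h>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

std::vector<uint8_t> nsec3_hash(const std::string& name,
                                const std::vector<uint8_t>& salt,
                                uint16_t iterations) {
    std::vector<uint8_t> digest(SHA_DIGEST_LENGTH);
    // First round: H(name || salt)
    std::vector<uint8_t> buf(name.begin(), name.end());
    buf.insert(buf.end(), salt.begin(), salt.end());
    SHA1(buf.data(), buf.size(), digest.data());
    // Remaining rounds: H(previous digest || salt)
    for (uint16_t i = 0; i < iterations; ++i) {
        buf.assign(digest.begin(), digest.end());
        buf.insert(buf.end(), salt.begin(), salt.end());
        SHA1(buf.data(), buf.size(), digest.data());
    }
    return digest;
}

int main() {
    std::vector<uint8_t> salt = {0xAA, 0xBB};          // example salt
    auto h = nsec3_hash("www.example.com", salt, 10);  // example iteration count
    for (uint8_t b : h) std::printf("%02x", b);
    std::printf("\n");
}
```

Each extra iteration is one more SHA-1 invocation per queried name, which is where the server-side CPU cost of negative answers accumulates.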
Yuan Wen, Zheng Wang, Michael F.P. O'Boyle
Heterogeneous systems consisting of multiple CPUs and GPUs are increasingly attractive as platforms for high performance computing. Such platforms are usually programmed using OpenCL which provides program portability by allowing the same program to execute on different types of device. As such systems become more mainstream, they will move from application-dedicated devices to platforms […]
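The portability referred to here comes from OpenCL discovering at run time whatever devices the installed platforms expose. A minimal, generic C++ sketch of that discovery step (standard OpenCL C API only, nothing specific to this work) might look like:

```cpp
// Minimal sketch: enumerate OpenCL platforms and their devices (CPUs, GPUs, ...).
// Build with the OpenCL ICD loader, e.g.  g++ devices.cpp -lOpenCL
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
    cl_uint num_platforms = 0;
    clGetPlatformIDs(0, nullptr, &num_platforms);
    std::vector<cl_platform_id> platforms(num_platforms);
    clGetPlatformIDs(num_platforms, platforms.data(), nullptr);

    for (cl_platform_id p : platforms) {
        char pname[256] = {0};
        clGetPlatformInfo(p, CL_PLATFORM_NAME, sizeof(pname), pname, nullptr);

        cl_uint num_devices = 0;
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, 0, nullptr, &num_devices);
        std::vector<cl_device_id> devices(num_devices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, num_devices, devices.data(), nullptr);

        for (cl_device_id d : devices) {
            char dname[256] = {0};
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(dname), dname, nullptr);
            std::printf("%s: %s\n", pname, dname);
        }
    }
}
```

The same binary can then pick a CPU, a GPU, or several devices at once, which is exactly the flexibility that makes scheduling on shared heterogeneous platforms an interesting problem.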
Dale Tristram, Karen Bradshaw
General-purpose computation on graphics processing units (GPGPU) has great potential to accelerate many scientific models and algorithms. However, some problems are considerably more difficult to accelerate than others, and it may be challenging for those new to GPGPU to ascertain the difficulty of accelerating a particular problem. Through what was learned in the acceleration of […]
Thomas R. W. Scogland, Wu-chun Feng
As core counts increase and as heterogeneity becomes more common in parallel computing, we face the prospect of programming hundreds or even thousands of concurrent threads in a single shared-memory system. At these scales, even highly-efficient concurrent algorithms and data structures can become bottlenecks, unless they are designed from the ground up with throughput as […]
Olav Aanes Fagerlund, Takeshi Kitayama, Gaku Hashimoto, Hiroshi Okuda
In finite element method simulations we often deal with large sparse matrices. Sparse matrix-vector multiplication (SpMV) is of high importance for iterative solvers. During the solver stage, most of the time is in fact spent in the SpMV routine. The SpMV routine is highly memory-bound; the processor spends much time waiting for the needed […]
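To make the memory-bound nature of SpMV concrete, here is a plain sequential C++ sketch of SpMV in the common compressed sparse row (CSR) format (names are illustrative, not the paper's data structures): each nonzero costs one multiply-add but also an indirect load of x[col], so the routine is typically limited by memory bandwidth rather than arithmetic.

```cpp
// Sequential SpMV y = A*x with A stored in CSR format (illustrative sketch).
#include <cstddef>
#include <vector>

struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // size rows+1: start of each row in col_idx/values
    std::vector<std::size_t> col_idx;  // column index of each nonzero
    std::vector<double> values;        // nonzero values
};

void spmv(const CsrMatrix& A, const std::vector<double>& x, std::vector<double>& y) {
    for (std::size_t i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
            // One multiply-add per nonzero, but the load of x[col_idx[k]] is
            // indirect and often cache-unfriendly: memory-bound behaviour.
            sum += A.values[k] * x[A.col_idx[k]];
        }
        y[i] = sum;
    }
}
```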
A. Gorobets, F.X. Trias, R. Borrell, G. Oyarzun, A. Oliva
The purpose of the work is twofold. Firstly, it is devoted to the development of efficient parallel algorithms for large-scale simulations of turbulent flows on different supercomputer architectures. It reports experience with massively-parallel accelerators, including graphics processing units from AMD and NVIDIA and Intel Xeon Phi coprocessors. Secondly, it introduces a new series of direct numerical […]
Jianbin Fang, Henk Sips, Ana Lucia Varbanescu
Due to the increasing complexity of multi/many-core architectures (with their mix of caches and scratch-pad memories) and applications (with different memory access patterns), the performance of many workloads becomes increasingly variable. In this work, we address one of the main causes for this performance variability: the efficiency of the memory system. Specifically, based on an […]
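The kind of variability targeted here can be reproduced with a trivial micro-benchmark: the same reduction over the same data, but with unit-stride versus strided access, usually runs at very different effective bandwidths. A stand-alone C++ sketch (only an illustration of access-pattern sensitivity, not the authors' methodology):

```cpp
// Toy illustration of memory-access-pattern sensitivity:
// sum an array with unit stride vs. a larger stride.
#include <chrono>
#include <cstdio>
#include <vector>

// Sum all elements of `a`, visiting them with the given stride
// (every element is visited exactly once; only the access order changes).
static float strided_sum(const std::vector<float>& a, std::size_t stride) {
    float sum = 0.0f;
    for (std::size_t start = 0; start < stride; ++start)
        for (std::size_t i = start; i < a.size(); i += stride)
            sum += a[i];
    return sum;
}

int main() {
    std::vector<float> a(1u << 26, 1.0f);  // ~256 MB of floats
    for (std::size_t stride : {std::size_t(1), std::size_t(16)}) {
        auto t0 = std::chrono::steady_clock::now();
        float s = strided_sum(a, stride);
        auto t1 = std::chrono::steady_clock::now();
        std::printf("stride %2zu: %.3f s (sum=%g)\n", stride,
                    std::chrono::duration<double>(t1 - t0).count(), s);
    }
}
```

On most machines the strided pass is markedly slower even though it performs the same arithmetic, because it wastes cache lines and defeats hardware prefetching.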
Ru Zhu
A finite-difference micromagnetic solver is presented utilizing C++ Accelerated Massive Parallelism (C++ AMP). The high-speed performance of a single Graphics Processing Unit (GPU) is demonstrated compared to a typical CPU-based solver. The speed-up of the GPU over the CPU is shown to be greater than 100 for problems with larger sizes. This solver is based […]
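For readers unfamiliar with C++ AMP, the programming model looks roughly like the stand-alone sketch below: a generic element-wise update offloaded to the default accelerator via parallel_for_each. This is not the solver's actual field-update kernel, just the C++ AMP idiom it builds on.

```cpp
// Minimal C++ AMP sketch: offload an element-wise update to the GPU.
// Requires a C++ AMP-capable compiler (e.g. MSVC); not the paper's solver.
#include <amp.h>
#include <cstdio>
#include <vector>

int main() {
    const int n = 1024;
    std::vector<float> m(n, 1.0f);

    concurrency::array_view<float, 1> mv(n, m);   // wraps host data for the accelerator
    concurrency::parallel_for_each(mv.extent,
        [=](concurrency::index<1> i) restrict(amp) {
            mv[i] = 0.5f * mv[i];                 // placeholder per-element update on the GPU
        });
    mv.synchronize();                             // copy results back to the host vector

    std::printf("m[0] = %f\n", m[0]);
}
```

A finite-difference field update would follow the same pattern, but with a 3D extent and a stencil reading neighbouring cells.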
Konstantinos Krommydas, Wu-chun Feng, Muhsen Owaida, Christos D. Antonopoulos, Nikolaos Bellas
The proliferation of heterogeneous computing platforms presents the parallel computing community with new challenges. One such challenge entails evaluating the efficacy of such parallel architectures and identifying the architectural innovations that ultimately benefit applications. To address this challenge, we need benchmarks that capture the execution patterns (i.e., dwarfs or motifs) of applications, both present and […]
Faiz Khan, Vincent Foley-Bourgon, Sujay Kathrotia, Erick Lavoie, Laurie Hendren
From its modest beginnings as a tool to validate forms, JavaScript is now an industrial-strength language used to power online applications such as spreadsheets, IDEs, image editors and even 3D games. Since all modern web browsers support JavaScript, it provides a medium that is both easy to distribute for developers and easy to access for […]
Matthew Doerksen
Heterogeneous multi-core architectures have a higher performance/power ratio than traditional homogeneous architectures. Due to their heterogeneity, these architectures support diverse applications but developing parallel algorithms on these architectures can be difficult. In implementing algorithms for heterogeneous systems, proprietary languages are often required, limiting portability. Although general purpose graphics processing units (GPUs) have shown great promise […]
Jianbin Fang, Henk Sips, Pekka Jaaskelainen, Ana Lucia Varbanescu
Due to the diversity of processor architectures and application memory access patterns, the performance impact of using local memory in OpenCL kernels has become unpredictable. For example, enabling the use of local memory for an OpenCL kernel can be beneficial for the execution on a GPU, but can lead to performance losses when running on […]
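As a concrete, hedged example of the trade-off being studied, the OpenCL C kernel below (held in a C++ raw-string literal; names are illustrative, not taken from the paper) stages data into __local memory before using it. On many GPUs local memory is fast on-chip storage shared by a work-group, while on CPUs it is typically emulated in ordinary cached memory, so the extra copy and barrier can cost more than they save.

```cpp
// Illustrative OpenCL kernel source (passed to clCreateProgramWithSource):
// each work-group stages a tile of `in` into __local memory, then reuses it.
// The __local buffer is sized by the host via clSetKernelArg(kernel, 2, bytes, nullptr).
static const char* kTiledKernel = R"CLC(
__kernel void tiled_scale(__global const float* in,
                          __global float* out,
                          __local float* tile)      // local scratch, one tile per work-group
{
    const size_t gid = get_global_id(0);
    const size_t lid = get_local_id(0);

    tile[lid] = in[gid];                 // cooperative load into local memory
    barrier(CLK_LOCAL_MEM_FENCE);        // make the tile visible to the whole work-group

    // Reuse the staged value; a real kernel would reuse neighbours as well,
    // e.g. tile[lid - 1] and tile[lid + 1] for a stencil.
    out[gid] = 2.0f * tile[lid];
}
)CLC";
```

Whether the staging pays off depends on how much the tile is actually reused and on the target device, which is exactly the unpredictability the paper addresses.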


* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes: one with two AMD GPUs, the other with an AMD and an nVidia GPU (see the platform details below). There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

A completed OpenCL project should be uploaded via the User dashboard (see instructions and example there); compilation and execution terminal output logs will be provided to the user.
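For orientation only, a minimal self-contained OpenCL program of the kind that could be uploaded might look like the sketch below: a trivial vector add on the first GPU found, with error handling trimmed. The file name and layout are arbitrary assumptions; the actual upload format is described in the dashboard instructions.

```cpp
// main.cpp: minimal OpenCL vector add (sketch only; build with -lOpenCL).
#include <CL/cl.h>
#include <cstdio>
#include <vector>

static const char* kSource = R"CLC(
__kernel void vadd(__global const float* a, __global const float* b, __global float* c) {
    size_t i = get_global_id(0);
    c[i] = a[i] + b[i];
}
)CLC";

int main() {
    const size_t n = 1024;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    cl_platform_id platform;
    clGetPlatformIDs(1, &platform, nullptr);
    cl_device_id device;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

    cl_int err;
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

    cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               n * sizeof(float), a.data(), &err);
    cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               n * sizeof(float), b.data(), &err);
    cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, n * sizeof(float), nullptr, &err);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &kSource, nullptr, &err);
    clBuildProgram(prog, 1, &device, nullptr, nullptr, nullptr);
    cl_kernel kernel = clCreateKernel(prog, "vadd", &err);

    clSetKernelArg(kernel, 0, sizeof(cl_mem), &da);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), &db);
    clSetKernelArg(kernel, 2, sizeof(cl_mem), &dc);

    size_t global = n;
    clEnqueueNDRangeKernel(queue, kernel, 1, nullptr, &global, nullptr, 0, nullptr, nullptr);
    clEnqueueReadBuffer(queue, dc, CL_TRUE, 0, n * sizeof(float), c.data(), 0, nullptr, nullptr);

    std::printf("c[0] = %f (expected 3.0)\n", c[0]);

    // Release resources.
    clReleaseMemObject(da); clReleaseMemObject(db); clReleaseMemObject(dc);
    clReleaseKernel(kernel); clReleaseProgram(prog);
    clReleaseCommandQueue(queue); clReleaseContext(ctx);
    return 0;
}
```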

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors
