Posts

Jan, 15

3rd Workshop on Scalable Computing in Distributed Systems and 8th Workshop on Large Scale Computations on Grids, SCoDiS-LaSCoG’14

The Large Scale Computing in Grids (LaSCoG) workshop originated in 2005, and when it was created we stated in its preamble: “The emerging paradigm for execution of large-scale computations, whether they originate as scientific or engineering applications, or for supporting large data-intensive calculations, is to utilize multiple computers at sites distributed across the […]
Jan, 14

Adaptation of an acoustic propagation model to the parallel architecture of a graphics processor

High performance underwater acoustic models are of great importance for enabling real-time acoustic source tracking, geoacoustic inversion, environmental monitoring and high-frequency underwater communications. Given the parallelizable nature of ray tracing in general, and of the ray superposition algorithm in particular, use of multiple computing units for the development of real-time efficient applications based on ray tracing […]
Jan, 14

High Performance Code Generation for Stencil Computation on Heterogeneous Multi-device Architectures

Heterogeneous architectures have been widely used in the domain of high performance computing. On the one hand, they allow a designer to use multiple types of computing units, each executing the tasks it is best suited for, to increase performance; on the other hand, they bring many challenges in programming for novice […]
Jan, 14

A Pervasive Parallel Framework for Visualization

We are on the threshold of a transformative change in the basic architecture of high-performance computing. The use of accelerator processors, characterized by large core counts, shared but asymmetrical memory, and heavy thread loading, is quickly becoming the norm in high performance computing. These accelerators represent significant challenges in updating our existing base of software. […]
Jan, 14

Optimal Alignment of Three Sequences On A GPU

We develop two algorithms, layered and sloped, to align three sequences on a GPU. Our algorithms can be used to determine the alignment score as well as the actual alignment. Experiments conducted using an NVIDIA C2050 GPU show that our sloped algorithm is 3 times as fast as the layered one. Further, the sloped algorithm delivers a […]
Jan, 14

A hierarchically blocked Jacobi SVD algorithm for single and multiple graphics processing units

We present a hierarchically blocked one-sided Jacobi algorithm for the singular value decomposition (SVD), targeting both single and multiple graphics processing units (GPUs). The blocking structure reflects the levels of GPU’s memory hierarchy. The algorithm may outperform MAGMA’s dgesvd, while retaining high relative accuracy. To this end, we developed a family of parallel pivot strategies […]
Jan, 14

Towards Portable Performance for Explicit Hydrodynamics Codes

Significantly increasing intra-node parallelism is widely recognised as being a key prerequisite for reaching exascale levels of computational performance. In future exascale systems it is likely that this performance improvement will be realised by increasing the parallelism available in traditional CPU devices and using massively-parallel hardware accelerators. The MPI programming model is starting to reach […]
Jan, 14

Parallelization and Optimization of Feature Detection Algorithms on Embedded GPU

In this paper, we parallelize and optimize two popular feature detection algorithms, SIFT and SURF, on the latest embedded GPU. Using the conventional OpenGL shading language and the recently developed OpenCL as the GPGPU software platforms, we compare implementation efficiency and speed between the two platforms as well as between GPU and CPU. Experimental result […]
Jan, 14

k+-buffer: Fragment Synchronized k-buffer

The k-buffer facilitates novel approaches to multi-fragment rendering and visualization for developing interactive applications on the GPU. Various alternatives have been proposed to alleviate its memory hazards and to avoid, completely or partially, the need for geometry pre-sorting. However, these came with the burden of excessive memory allocation and depth precision artifacts. We introduce k+-buffer, a […]
Jan, 14

High Performance Programming for Soft Computing

This book examines the present and future of soft computing techniques. It explains how to use the latest technological tools, such as multicore processors and graphics processing units, to implement highly efficient intelligent system methods using a general purpose computer.
Jan, 14

GPUs for real-time processing in HEP trigger systems

We describe a pilot project (GAP, the GPU Application Project) for the use of GPUs (graphics processing units) in online triggering applications for High Energy Physics experiments. Two major trends can be identified in the development of trigger and DAQ systems for particle physics experiments: the massive use of general-purpose commodity systems such as commercial […]
Jan, 12

A Framework for Productive, Efficient and Portable Parallel Computing

Developing efficient parallel implementations and fully utilizing the available resources of parallel platforms is now required for software applications to scale to new generations of processors. Yet, parallel programming remains challenging for programmers due to the requisite low-level knowledge of the underlying hardware and parallel computing constructs. These restrictions in turn impede experimentation with various […]

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us: