Posts
Jan, 16
VertexAPI2 – A Vertex-Program API for Large Graph Computations on the GPU
VertexAPI2 uses state-of-the-art GPU algorithms to implement the Gather-Apply-Scatter (GAS) abstraction for graph computations. VertexAPI2 delivers up to an order of magnitude better performance than the previous implementation and, in some cases, performance comparable to speed-of-light hand-coded algorithms, while retaining the simplicity of development of the GAS model. The current code also has a preliminary […]
Jan, 16
Improving Student Learning in Computer Science Courses by Using Virtual OpenCL Laboratory
Laboratory experience is an essential part of engineering and science education. Virtual laboratories are widely used by universities and research institutions across academic disciplines. However, general-purpose virtual laboratories are weak for computer graphics, whose experiments need to be run on high-performance computers. In the assessment of a graduate […]
Jan, 15
3rd Workshop on Scalable Computing in Distributed Systems and 8th Workshop on Large Scale Computations on Grids, SCoDiS-LaSCoG’14
The Large Scale Computing in Grids (LaSCoG) workshop originated in 2005, and when it was created we stated in its preamble that: “The emerging paradigm for execution of large-scale computations, whether they originate as scientific or engineering applications, or for supporting large data-intensive calculations, is to utilize multiple computers at sites distributed across the […]
Jan, 14
Adaptation of an acoustic propagation model to the parallel architecture of a graphics processor
High-performance underwater acoustic models are of great importance for enabling real-time acoustic source tracking, geoacoustic inversion, environmental monitoring and high-frequency underwater communications. Given the parallelizable nature of ray tracing in general, and of the ray superposition algorithm in particular, use of multiple computing units for the development of real-time efficient applications based on ray tracing […]
Jan, 14
High Performance Code Generation for Stencil Computation on Heterogeneous Multi-device Architectures
Heterogeneous architectures have been widely used in the domain of high performance computing. On the one hand, they allow a designer to use multiple types of computing units, each executing the tasks it is best suited for, to increase performance; on the other hand, they bring many challenges in programming for novice […]
Jan, 14
A Pervasive Parallel Framework for Visualization
We are on the threshold of a transformative change in the basic architecture of high-performance computing. The use of accelerator processors, characterized by large core counts, shared but asymmetrical memory, and heavy thread loading, is quickly becoming the norm in high performance computing. These accelerators represent significant challenges in updating our existing base of software. […]
Jan, 14
Optimal Alignment of Three Sequences On A GPU
We develop two algorithms, layered and sloped, to align three sequences on a GPU. Our algorithms can be used to determine the alignment score as well as the actual alignment. Experiments conducted using an NVIDIA C2050 GPU show that our sloped algorithm is 3 times as fast as the layered one. Further, the sloped algorithm delivers a […]
Jan, 14
A hierarchically blocked Jacobi SVD algorithm for single and multiple graphics processing units
We present a hierarchically blocked one-sided Jacobi algorithm for the singular value decomposition (SVD), targeting both single and multiple graphics processing units (GPUs). The blocking structure reflects the levels of the GPU’s memory hierarchy. The algorithm may outperform MAGMA’s dgesvd, while retaining high relative accuracy. To this end, we developed a family of parallel pivot strategies […]
Jan, 14
Towards Portable Performance for Explicit Hydrodynamics Codes
Significantly increasing intra-node parallelism is widely recognised as being a key prerequisite for reaching exascale levels of computational performance. In future exascale systems it is likely that this performance improvement will be realised by increasing the parallelism available in traditional CPU devices and using massively-parallel hardware accelerators. The MPI programming model is starting to reach […]
Jan, 14
Parallelization and Optimization of Feature Detection Algorithms on Embedded GPU
In this paper, we parallelize and optimize two popular feature detection algorithms, SIFT and SURF, on the latest embedded GPU. Using the conventional OpenGL shading language and the recently developed OpenCL as GPGPU software platforms, we compare their implementation efficiency and speed against each other, as well as between the GPU and the CPU. Experimental result […]
Jan, 14
k+-buffer: Fragment Synchronized k-buffer
The k-buffer facilitates novel approaches to multi-fragment rendering and visualization for developing interactive applications on the GPU. Various alternatives have been proposed to alleviate its memory hazards and to avoid, completely or partially, the need for geometry pre-sorting. However, these came with the burden of excessive memory allocation and depth precision artifacts. We introduce k+-buffer, a […]
Jan, 14
High Performance Programming for Soft Computing
This book examines the present and future of soft computing techniques. It explains how to use the latest technological tools, such as multicore processors and graphics processing units, to implement highly efficient intelligent system methods on a general-purpose computer.