
Posts

Aug, 15

First experiences with the Intel MIC architecture at LRZ

With the rapidly growing demand for computing power, new accelerator-based architectures have entered the world of high performance computing over roughly the past five years. In particular, GPGPUs have recently become very popular; however, programming GPGPUs using programming languages like CUDA or OpenCL is cumbersome and error-prone. Trying to overcome these difficulties, Intel developed their own […]

Aug, 14

GPU Acceleration of a Basket Option Pricing Engine

One of the most important methods for pricing complex derivatives is Monte Carlo simulation. However, this method requires a large amount of computing resources for accurate estimates. Since Monte Carlo simulations used in derivatives pricing are often parallelisable, one way to reduce the computing time is to use GPUs, which allow many copies of the […]

Aug, 10

Analyzing Optimization Techniques for Power Efficiency on Heterogeneous Platforms

Graphics processing units (GPUs) have become widely accepted as the computing platform of choice in many high performance computing domains. Programming standards such as OpenCL are used to leverage the inherent parallelism offered by GPUs. Source code optimizations such as loop unrolling and tiling when targeted to heterogeneous applications have reported large […]

Aug, 9

A GPU implementation of massively parallel direction splitting for the incompressible Navier-Stokes equations

Guermond and Minev proposed a directional splitting algorithm to solve the incompressible Stokes equations. Their algorithm applies the alternating direction implicit method to the viscosity term. The pressure update uses a direction splitting method in order to enforce the incompressibility constraint, as opposed to commonly used projection methods that require the solution of a Poisson […]

Aug, 9

A GPGPU based program to solve the TDSE in intense laser fields through the finite difference approach

We present a general-purpose computing on graphics processing units (GPGPU) based computational program and framework for the electronic dynamics of atomic systems under intense laser fields. We present our results using the case of hydrogen; however, the code is trivially extensible to tackle problems within the single-active electron (SAE) approximation. Building on our previous work, […]

Aug, 7

Automatic Skeleton-Based Compilation through Integration with an Algorithm Classification

This paper presents a technique to fully automatically generate efficient and readable code for parallel processors. We base our approach on skeleton-based compilation and "algorithmic species", an algorithm classification of program code. We use a tool to automatically annotate C code with species information where possible. The annotated program code is subsequently fed into the […]

Aug, 6

Portable Parallel Kernels for High-Speed Beamforming in Synthetic Aperture Ultrasound Imaging

In medical ultrasound, synthetic aperture (SA) imaging is well regarded as a novel image formation technique that achieves higher resolution than that offered by existing scanners. However, its intensive processing load is known to be a challenging factor. To address this computational demand, this paper proposes a new parallel approach based on the design of […]

Aug, 1

A note on the GPU acceleration of eigenvalue computations

Eigenvalue computations for large sparse matrices, such as the Lanczos method, are commonly based on Krylov subspace techniques. One of the dominant operations in such algorithms is the iterated computation of inner products with the same vector in order to preserve orthogonality of the Krylov basis. These operations can be accelerated by existing BLAS functionality using […]

Aug, 1

Matrix Convolution using Parallel Programming

The convolution theorem is used to multiply matrices of two different sizes, i.e., matrices in which the number of rows in the first matrix is not equal to the number of columns in the second matrix. In this study, the multiplication of 3×3 and 4×4 matrices was done using MPI. A 3×3 matrix was taken […]

Jul, 12

A GPGPU-based Pipeline for Accelerated Rendering of Point Clouds

Direct rendering of large point clouds has become common practice in architecture and archaeology in recent years. Due to the high point density no meshes are reconstructed from the scanning data, but the points can be rendered directly as primitives of a graphics API like OpenGL. However, these APIs and the hardware, which they are […]

Jul, 12

SIMD Divergence Optimization through Intra-Warp Compaction

SIMD execution units in GPUs are increasingly used for high performance and energy efficient acceleration of general purpose applications. However, SIMD control flow divergence effects can result in reduced execution efficiency in a class of GPGPU applications, classified as divergent applications. Improving SIMD efficiency, therefore, has the potential to bring significant performance and energy benefits […]

Jul, 7

CrowdCL: Web-Based Volunteer Computing with WebCL

We present CrowdCL, an open-source framework for the rapid development of volunteer computing and OpenCL applications on the web. Drawing inspiration from existing GPU libraries like PyCUDA, CrowdCL provides an abstraction layer for WebCL aimed at reducing boilerplate and improving code readability. CrowdCL also provides developers with a framework to easily run computations in the […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
