
Posts

Nov, 30

Unleashing the Power of Distributed CPU/GPU Architectures: Massive Astronomical Data Analysis and Visualization case study

Upcoming and future astronomy research facilities will systematically generate terabyte-sized data sets, moving astronomy into the petascale data era. While such facilities will provide astronomers with unprecedented levels of accuracy and coverage, the increases in dataset size and dimensionality will pose serious computational challenges for many current astronomy data analysis and visualization tools. With such […]
Nov, 30

GPU-Accelerated SPH Model for Water Waves and Other Free Surface Flows

This paper discusses the meshless numerical method Smoothed Particle Hydrodynamics and its application to water waves and nearshore circulation. In particular, we focus on an implementation of the model on the graphics processing unit (GPU) of computers, which permits low-cost supercomputing capabilities for certain types of computational problems. The implementation here runs on Nvidia graphics […]
Nov, 29

Parallel preconditioning for spherical harmonics expansions of the Boltzmann transport equation

While the Monte Carlo method for the Boltzmann transport equation for semiconductors has already been parallelized, this is much more difficult to accomplish for the deterministic spherical harmonics expansion method which requires the solution of a linear system of equations. For the typically employed iterative solvers, preconditioners are required to obtain good convergence rates. These […]
Nov, 29

Multiresolution Flow Simulations on Multi/many-core Architectures

One of the key challenges in Computational Science is closing the gap between the available computer power and its effective utilization for the simulation of complex physical systems and engineering applications. In order to achieve this goal we must minimize the time-to-solution and the related energy requirements of simulations by developing scalable software and methods […]
Nov, 29

Applications Performance on GPGPUs with the Fermi Architecture

The latest GPU architecture released by Nvidia, code-named "Fermi", is the most advanced computing GPU architecture ever built. Radical changes took place on the GPU computing architecture compared to Fermi’s predecessors such as the GT200 series and the G80s. In this dissertation the Fermi architecture is analysed, addressing the most prominent upgrades, by running extensive […]
Nov, 29

Dynamic Task Parallelism with a GPU Work-Stealing Runtime System

NVIDIA’s Compute Unified Device Architecture (CUDA) and its attached C/C++ based API went a long way towards making GPUs more accessible to mainstream programming. So far, the use of GPUs for high performance computing has been primarily restricted to data parallel applications, and with good reason. The high number of computational cores and high memory […]
Nov, 29

Directives Based Programming of GPU Accelerated Systems

Graphics Processing Units (GPUs) are commodity chips primarily used as coprocessors for processing high definition graphics on a computer system. They offer greater processing power and efficiency in handling single- and double-precision floating-point numbers, with lower power consumption than CPUs. Realising their potential in general purpose computing, manufacturers of these chips have […]
Nov, 29

Enabling Efficient Online Profiling of Homogeneous and Heterogeneous Multicore Systems

Using profiling tools is a common way to understand computer systems and software and to achieve the best performance. Profiling becomes more important as computing technology advances and makes it more difficult to intuitively reason about system characteristics. However, the recent shift in computing technology to multicore systems and heterogeneous systems requires new profiling methods […]
Nov, 29

GPGPU Volume Classification using SimpleOpenCL

In volume visualization, the definition of the regions of interest is inherently an iterative trial-and-error process of finding the best parameters to classify and render the final image. In this work, we present a general framework for training multi-class classifiers using Error-Correcting Output Codes. Moreover, we propose a GPGPU parallelization system using SimpleOpenCL, an open-source […]
Nov, 29

Solving Dense Generalized Eigenproblems on Multi-threaded Architectures

We compare two approaches to compute a portion of the spectrum of dense symmetric definite generalized eigenproblems: one is based on the reduction to tridiagonal form, and the other on the Krylov-subspace iteration. Two large-scale applications, arising in molecular dynamics and material science, are employed to investigate the contributions of the application, architecture, and parallelism […]
Nov, 29

Electric polarizability of hadrons with overlap fermions on multi-GPUs

Electric polarizability is an important parameter for the internal structure of hadrons. Previous studies of polarizabilities have been done at relatively heavy pion masses, leaving the chiral region largely unexplored. In this report, we use overlap fermions, which are known to be computationally demanding, to properly capture the chiral dynamics. We present an implementation strategy […]
Nov, 29

A GPU-based survey for millisecond radio transients using ARTEMIS

Astrophysical radio transients are excellent probes of extreme physical processes originating from compact sources within our Galaxy and beyond. Radio frequency signals emitted from these objects provide a means to study the intervening medium through which they travel. Next generation radio telescopes are designed to explore the vast unexplored parameter space of high time resolution […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
