Posts
Aug, 29
Fast network communities visualization on massively parallel GPU architecture
Modeling phenomena with networks has wide application in many disciplines, including biology, economics, sociology, and computer science. In network analysis, modularity is an important measure for automatically extracting communities of closely connected nodes. Another important aspect of network analysis is network visualization. Different techniques for network layout generation exist and the force-driven layout […]
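As a rough illustration of the modularity measure mentioned in this abstract, the sketch below computes the standard Newman modularity Q for an undirected, unweighted graph and a given community assignment. The function name `modularity`, the edge-list input format, and the example graph are assumptions made for this illustration; the paper's GPU implementation is not shown in the excerpt and may differ.

```python
def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once (assumed format).
    community: dict mapping node -> community label (assumed format).
    Uses Q = sum over communities c of [ L_c / m - (d_c / (2 m))^2 ],
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    intra = {}   # edges with both endpoints in the same community
    degree = {}  # total degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Hypothetical example: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {n: "a" for n in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Placing every node in a single community gives Q = 0, while the split into the two triangles gives a positive Q, which is why maximizing Q tends to recover groups of closely connected nodes.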
Aug, 29
Efficient Sparse Matrix-Vector Multiplication on x86-Based Many-Core Processors
Sparse matrix-vector multiplication (SpMV) is an important kernel in many scientific applications and is known to be memory bandwidth limited. On modern processors with wide SIMD and large numbers of cores, we identify and address several bottlenecks which may limit performance even before memory bandwidth: (a) low SIMD efficiency due to sparsity, (b) overhead due […]
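For readers unfamiliar with the kernel, a minimal reference sketch of SpMV over the common compressed sparse row (CSR) layout is given below. The CSR format, the function name `spmv_csr`, and the sample matrix are assumptions for illustration; the excerpt does not say which storage format or optimizations the paper uses. The indirect read `x[col_idx[k]]` is the irregular memory access that makes the kernel hard to vectorize.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    values:  nonzero entries of A, row by row
    col_idx: column index of each entry in `values`
    row_ptr: row_ptr[i]:row_ptr[i+1] is the slice of `values` for row i
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect (gather) access into x: the irregular part of SpMV.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example:
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```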
Aug, 29
Numerical simulations of acoustic waves with the graphic acceleration GAMER code
We present results of numerical simulations of acoustic waves using the Graphics Processing Unit (GPU)-accelerated GAMER code, which implements a second-order Godunov-type numerical scheme and adaptive mesh refinement (AMR). The AMR implementation is based on constructing a hierarchy of grid patches with an octree data structure. In this code a hybrid […]
Aug, 29
Towards High-Performance and Cost-Effective Distributed Storage Systems with Information Dispersal Algorithms
Reliability is one of the most fundamental challenges for high performance computing (HPC) and cloud computing. Data replication is the de facto mechanism to achieve high reliability, even though it has been criticized for its high cost and low efficiency. Recent research showed promising results by switching from traditional data replication to software-based RAID. […]
Aug, 28
On the Use of Graphics Processing Units (GPUs) for Molecular Dynamics Simulation of Spherical Particles
General-purpose computation on Graphics Processing Units (GPUs) in personal computers has recently become an attractive alternative to parallel computing on clusters and supercomputers. We present a GPU implementation of an accurate molecular dynamics algorithm for a system of spheres. The new hybrid CPU-GPU implementation takes into account all the degrees of freedom, including the quaternion representation […]
Aug, 28
Throughput-Oriented Analytical Models for Performance Estimation on Programmable Hardware Accelerators
In this thesis, we have mainly worked on two topics in GPU performance analysis. First, we have developed an analytical method and a timing estimation tool (TEG) to predict the performance of CUDA applications on GT200-generation GPUs. TEG can predict the performance of GPU applications at a cycle-approximate level. Second, we have developed an approach to estimate GPU […]
Aug, 28
Audiovisual Voice Activity Detection and Localization of Simultaneous Speech Sources
Given the tendency to create interfaces between humans and machines that increasingly allow simple ways of interaction, it is only natural that research effort is put into techniques that seek to simulate the most conventional means of communication humans use: speech. In the human auditory system, voice is automatically processed by the brain in […]
Aug, 28
In-Situ Statistical Analysis of Autotune Simulation Data using Graphical Processing Units
Developing accurate building energy simulation models to assist energy efficiency at speed and scale is one of the research goals of the Whole-Building and Community Integration group, which is part of the Building Technologies Research and Integration Center (BTRIC) at Oak Ridge National Laboratory (ORNL). The aim of the Autotune project is to speed up […]
Aug, 28
Dynamic Load Balancing on Massively Parallel Computer Architectures
This thesis reports on using dynamic load balancing methods on massively parallel computers in the context of multi-threaded computations. In particular, we investigate the applicability of a randomized work stealing algorithm to ray tracing and breadth-first search as representatives of real-world applications with dynamic work creation. For our considerations we made use of current massively […]
Aug, 27
The development and expansion of HOOMD-blue through six years of GPU proliferation
HOOMD-blue is the first general-purpose MD code built from the ground up for GPU acceleration, and has been actively developed since March 2007. It supports a variety of force fields and integrators targeted at soft-matter simulations. Because it is an open-source project, numerous developers have contributed useful feature additions back to the main code. High […]
Aug, 27
Compilation techniques and language support to facilitate dependence-driven computation
As the demand increases for high performance and power efficiency in modern computer runtime systems and architectures, programmers are left with the daunting challenge of fully exploiting these systems for efficiency, high-level expressibility, and portability across different computing architectures. Emerging programming models such as the task-based runtime StarPU and many-core architectures such as GPUs force […]
Aug, 27
Solutions for Optimizing the Monte Carlo Option Pricing Method’s Implementation Using the Compute Unified Device Architecture
Finance-related problems require more and more computations; therefore, finding efficient implementations for option pricing models on modern architectures has become an important challenge. Although there are numerous implementations of the Monte Carlo method on central processing units, many of them face limitations arising from the increased computational power required. In this paper, […]
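To make the Monte Carlo option pricing method concrete, here is a minimal CPU-side sketch that prices a European call under geometric Brownian motion. The option type, the price model, the function name `mc_european_call`, and all parameter values are assumptions for illustration only; the excerpt does not state which options or models the paper treats, and its CUDA implementation is not reproduced here. Each simulated path is independent of the others, which is what makes the method a natural fit for massively parallel hardware.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Monte Carlo estimate of a European call price (illustrative sketch).

    Assumes the terminal price follows geometric Brownian motion:
    S_T = s0 * exp((rate - sigma^2 / 2) * T + sigma * sqrt(T) * Z), Z ~ N(0, 1).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    total_payoff = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + vol * z)
        total_payoff += max(s_t - strike, 0.0)  # call payoff at maturity
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * total_payoff / n_paths


# Hypothetical parameters: at-the-money call, one year to maturity.
price = mc_european_call(s0=100.0, strike=100.0, rate=0.05,
                         sigma=0.2, maturity=1.0, n_paths=100_000)
```

For these parameters the Black-Scholes closed-form value is about 10.45, so the estimate should land close to that figure, with a statistical error that shrinks in proportion to one over the square root of the number of paths.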