Posts
May, 7
Simulating the universe with GPU-accelerated supercomputers: n-body methods, tests, and examples
We demonstrate the acceleration obtained from using GPU/CPU hybrid clusters and supercomputers for N-body simulations of gravity based in part on the author’s new code development. Validation tests are shown for cosmological simulations and for galaxy simulations, along with their respective speedups compared to traditional simulations. Potential new applications for science enabled by this advance […]
May, 7
Critical Links Detection using CUDA
The Critical Links Detection (CLD) Problem consists of finding the smallest set of edges in a graph to be protected so that if a given number of unprotected edges are removed the diameter does not exceed a given value. The diameter of a graph is defined as the length of the All-Pairs Shortest Path (APSP). This […]
May, 7
Optimizing CUDA Code By Kernel Fusion – Application on BLAS
Modern GPUs are able to perform significantly more arithmetic operations than transfers of a single word to or from global memory. Hence, many GPU kernels are limited by memory bandwidth and cannot exploit the arithmetic power of GPUs. However, memory locality can often be improved by kernel fusion when a sequence of kernels is […]
May, 6
Accelerating Financial Applications on the GPU
The QuantLib library is a popular library used for many areas of computational finance. In this work, the parallel processing power of the GPU is used to accelerate QuantLib financial applications. Black-Scholes, Monte-Carlo, Bonds, and Repo code paths in QuantLib are accelerated using hand-written CUDA and OpenCL codes specifically targeted for the GPU. Additionally, HMPP […]
May, 6
Algorithms for Rapid Characterization and Optimization of Aperture and Reflector Antennas
Reflector antennas play a key role in the communication industry, and enhancing the speed of the analysis of reflector antenna systems can provide better responsiveness to the needs of industry as well as promote better understanding of software modeling through faster visualization. A reflector antenna system typically consists of a feed assembly, with a feedhorn […]
May, 6
Simulation of Biological Tissue using Mass-Spring-Damper Models
The goal of this project was to evaluate the viability of a mass-spring-damper based model for modeling of biological tissue. A method for automatically generating such a model from data taken from 3D medical imaging equipment including both the generation of point masses and an algorithm for generating the spring-damper links between these points is […]
May, 6
Fast Implementation of Scale Invariant Feature Transform Based on CUDA
Scale-invariant feature transform (SIFT) is an algorithm in computer vision to detect and describe local features in images. Due to its excellent performance, SIFT is widely used in many applications, but the implementation of SIFT is complicated and time-consuming. To solve this problem, this paper presents a novel acceleration algorithm for SIFT implementation based on […]
May, 6
Fast computation of MadGraph amplitudes on graphics processing unit (GPU)
Continuing our previous studies on QED and QCD processes, we use the graphics processing unit (GPU) for fast calculations of helicity amplitudes for general Standard Model (SM) processes. Additional HEGET codes to handle all SM interactions are introduced, as well as the program MG2CUDA that converts arbitrary MadGraph generated HELAS amplitudes (FORTRAN) into HEGET codes in CUDA. […]
May, 4
Real-time Stochastic Optimization of Complex Energy Systems on High Performance Computers
We present a scalable approach that computes in operationally-compatible time the energy dispatch under uncertainty for complex energy systems of realistic size. Complex energy systems, such as the US power grid, are affected by increased uncertainty of their target power sources, due, for example, to increasing penetration of wind power coupled with the physical impossibility […]
May, 4
Exploration of Multifrontal Method with GPU in Power Flow Computation
Solving sparse linear equations is the key part of power system analysis. The Newton-Raphson method and its variations require repeated solution of sparse linear equations; therefore, improving the efficiency of solving sparse linear equations will accelerate the overall power system analysis. This work integrates the multifrontal method and a graphics processing unit (GPU) linear algebra library to solve […]
May, 4
GPUWattch: Enabling Energy Optimizations in GPGPUs
General-purpose GPUs (GPGPUs) are becoming prevalent in mainstream computing, and performance per watt has emerged as a more crucial evaluation metric than peak performance. As such, GPU architects require robust tools that will enable them to quickly explore new ways to optimize GPGPUs for energy efficiency. We propose a new GPGPU power model that is […]
May, 4
Implementations of the FFT algorithm on GPU
The fast Fourier transform (FFT) plays an important role in digital signal processing (DSP) applications, and its implementation involves a large number of computations. Many DSP designers have been working on implementations of the FFT algorithms on different devices, such as the central processing unit (CPU), field-programmable gate array (FPGA), and graphics processing unit (GPU), […]

