Posts
Apr, 9
Literature Review: Parallel Computing on linear equations of linear elastic FEM stimulation with CUDA
Scientific computation is the field of study that uses computers to implement mathematical models of physical phenomena such as FEM in deformation measurement in virtual reality. Scientific and engineering problems that would be almost impossible to solve by hand can be handled properly on a computer. A numerical algorithm calculating for different fields […]
Apr, 9
Exploring the power of GPU’s for training Deep Belief Networks
One of the major current research trends is the evolution of heterogeneous parallel computing. GP-GPU computing is being widely used and several applications have been designed to exploit the massive parallelism that GP-GPUs have to offer. While GPUs have always been widely used in areas of computer vision for image processing, little has been done […]
Apr, 7
A New Non-Blocking Approach on GPU Dynamical Memory Management
Dynamic memory allocation is a very important and basic technique implemented on modern computer architectures. In massively parallel processor (MPP) architectures such as Graphics Processing Units (GPUs), many threads try to send allocation or deallocation requests to the system at the same time, which can cause synchronization issues or race conditions. In this […]
Apr, 7
A New Digital Repository for Hyperspectral Imagery with Unmixing-Based Retrieval Functionality Implemented on GPUs
Over the last few years, hyperspectral image data have been collected for a large number of locations over the world, using a variety of instruments for Earth observation. In addition, several new hyperspectral missions will become operational in the near future. Despite the increasing availability and large volume of hyperspectral data in many applications, there […]
Apr, 7
State of the Art Report on Real-time Rendering with Hardware Tessellation
For a long time, GPUs have primarily been optimized to render more and more triangles with increasingly flexible shading. However, scene data itself has typically been generated on the CPU and then uploaded to GPU memory. Therefore, widely used techniques that generate geometry at render time on demand for the rendering of smooth and displaced […]
Apr, 7
Detection of a faint fast-moving near-Earth asteroid using synthetic tracking technique
We report a detection of a faint near-Earth asteroid (NEA), which was done using our synthetic tracking technique and the CHIMERA instrument on the Palomar 200-inch telescope. This asteroid, with apparent magnitude of 23, was moving at 5.97 degrees per day and was detected at a signal-to-noise ratio (SNR) of 15 using 30 sec of […]
Apr, 7
Quantifying the Energy Efficiency of Object Recognition and Optical Flow
In this report, we analyze the computational and performance aspects of current state-of-the-art object recognition and optical flow algorithms. First, we identify important algorithms for object recognition and optical flow, then we perform a pattern decomposition to identify key computations. We include profiles of the runtime and energy efficiency (GFLOPS/W) for our implementation of these […]
Apr, 7
GPU-Accelerated Face Detection Algorithm
This work is an overview of a preliminary experience in developing high-performance face detection accelerated by GPU co-processors. The objective is to illustrate the advantages and difficulties encountered while utilizing GPU technology to perform face detection. Moreover, the introduced implementation is much faster than existing techniques. Previous techniques for speeding up face […]
Apr, 7
A Study on Efficient Application Mapping on Parallel Computing Accelerators
Since the invention of electronic computers, their performance has advanced constantly. Recent progress in microprocessor performance has been achieved mainly by increasing the number of cores on a device rather than the working frequency. In addition, because of the increasing density of semiconductors, not only computational performance but also density of […]
Apr, 7
Parallel processing for SAR image generation in CUDA – GPGPU platform
High resolution imagery from synthetic aperture radar (SAR) video data requires numerical computations of the order of gigaflops (GFLOP). The computational burden increases with the image size and the amount of input raw video signals. General purpose graphic processor units (GPGPU) can play a pivotal role in parallel processing the raw video data to generate […]
Apr, 7
Acceleration of a Full-scale Industrial CFD Application with OP2
Hydra is a full-scale industrial CFD application used for the design of turbomachinery at Rolls Royce plc. It consists of over 300 parallel loops with a code base exceeding 50K lines and is capable of performing complex simulations over highly detailed unstructured mesh geometries. Unlike simpler structured-mesh applications, which feature high speed-ups when accelerated by […]
Apr, 7
Implementing a Sparse Matrix Vector Product for the SELL-C/SELL-C-sigma formats on NVIDIA GPUs
Numerical methods in sparse linear algebra typically rely on a fast and efficient matrix vector product, as this usually is the backbone of iterative algorithms for solving eigenvalue problems or linear systems. Against the background of a large diversity in the characteristics of high performance computer architectures, it is a challenge to derive a cross-platform […]