Posts

Feb, 9

The Boat Hull Model: Adapting the Roofline Model to Enable Performance Prediction for Parallel Computing

Multi-core and many-core have been major trends for the past six years, and are expected to continue for the coming decades. With these trends of parallel computing, it becomes increasingly difficult to decide on which architecture to run a given application. In this work, we use an algorithm classification to predict performance prior to algorithm […]
Feb, 9

CudaRF: A CUDA-based Implementation of Random Forests

Machine learning algorithms are frequently applied in data mining applications. Many of the tasks in this domain concern high-dimensional data. Consequently, these tasks are often complex and computationally expensive. This paper presents a GPU-based parallel implementation of the Random Forests algorithm. In contrast to previous work, the proposed algorithm is based on the compute unified […]
Feb, 9

Real-time simulation of a spiking neural network model of the basal ganglia circuitry using general purpose computing on graphics processing units

Real-time simulation of a biologically realistic spiking neural network is necessary for evaluation of its capacity to interact with real environments. However, the real-time simulation of such a neural network is difficult due to its high computational costs that arise from two factors: (1) vast network size and (2) the complicated dynamics of biologically realistic […]
Feb, 8

Auto-Generation and Auto-Tuning of 3D Stencil Codes on GPU Clusters

This paper develops and evaluates search and optimization techniques for auto-tuning 3D stencil (nearest-neighbor) computations on GPUs. Observations indicate that parameter tuning is necessary for heterogeneous GPUs to achieve optimal performance with respect to a search space. Our proposed framework takes a most concise specification of stencil behavior from the user as a single formula, […]
Feb, 8

The PEPPHER Approach to Programmability and Performance Portability for Heterogeneous many-core Architectures

The European FP7 project PEPPHER is addressing programmability and performance portability for current and emerging heterogeneous many-core architectures. As its main idea, the project proposes a multi-level parallel execution model comprised of potentially parallelized components existing in variants suitable for different types of cores, memory configurations, input characteristics, optimization criteria, and couples this with […]
Feb, 8

Acceleration of a Locally Tuned Sine Non Linear Video Enhancement Algorithm on GPGPU

Computer Vision based applications support various domains such as medical, manufacturing, military intelligence and surveillance systems. These applications can be divided into: image acquisition, pre-processing, feature extraction, detection or segmentation, and high-level processing. However, these tasks are time intensive due to the compute bound nature of the algorithm. In this thesis, an algorithm, based on […]
Feb, 8

Symbolic Testing of OpenCL Code

We present an effective technique for crosschecking a C or C++ program against an accelerated OpenCL version, as well as a technique for detecting data races in OpenCL programs. Our techniques are implemented in KLEE-CL, a symbolic execution engine based on KLEE and KLEE-FP that supports symbolic reasoning on the equivalence between symbolic values. Our […]
Feb, 8

Verifiable Computation with Massively Parallel Interactive Proofs

As the cloud computing paradigm has gained prominence, the need for verifiable computation has grown increasingly urgent. The concept of verifiable computation enables a weak client to outsource difficult computations to a powerful, but untrusted, server. Protocols for verifiable computation aim to provide the client with a guarantee that the server performed the requested computations […]
Feb, 7

GMP implementation on CUDA – A Backward Compatible Design With Performance Tuning

The goal of this project is to implement the GMP library in CUDA and evaluate its performance. GMP (GNU Multiple Precision) is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers. There is no practical limit to the precision except the ones implied by the available memory […]
Feb, 7

Enabling Traceability in MDE to Improve Performance of GPU Applications

Graphics Processor Units (GPUs) are known for offering high performance and power efficiency for processing algorithms that suit well to their massively parallel architecture. Unfortunately, as parallel programming for this kind of architecture requires a complex distribution of tasks and data, developers find it difficult to implement their applications effectively. Although approaches based on source-to-source […]
Feb, 7

Seismic imaging based on spectral differentiation matrix and GPU implementation

Finite-difference depth migration based on one-way wave equation uses second-order, fourth-order, or other finite-order approximations for spatial derivatives. These finite-order approximations often lead to spatial dispersion errors and low accuracy. To avoid these errors, smaller mesh spacings are used, which results in huge increase in computation cost. In this paper, we develop a new spectral […]
Feb, 7

Stochastic Differential Equations simulation using GPU

We discretize generic stochastic differential equations (SDEs) using Euler and Milstein schemes. We propose GPU-based random number generation, GPURNG. Using GPURNG and the Euler and Milstein methods, we derive algorithms with which we solve the underlying SDE. For a test case, we show the simulation results for European options. We show that our algorithms give greater than […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
