
Posts

Feb, 21

Software-Based ECC for GPUs

Commodity off-the-shelf GPUs lack error checking mechanisms for graphics memory, whereas conventional HPC platforms have used hardware-based ECC for DRAMs. To alleviate this reliability concern, we propose a software-based ECC for GPGPU applications. We add small code fragments to ordinary CUDA programs that compute ECCs for data residing in graphics memory so that transient bit-flips […]
Feb, 21

Accelerated Root Finding for Computational Finance

A parallel implementation of root finding on an SIMD application accelerator is reported. These are roots of stochastic differential equations in the computational finance domain which require a stochastic simulation to be performed for each evaluation of the pricing function. Experiments show that a speedup of 15X can be achieved over using a stand-alone CPU […]
Feb, 21

An Automated Approach for SIMD Kernel Generation for GPU based Software Acceleration

Graphics Processing Units (GPUs) are highly parallel Single Instruction Multiple Data (SIMD) engines, with extremely high degrees of available hardware parallelism. The task of implementing a software routine on a GPU currently requires significant manual design, iteration and experimentation. This paper presents an automated approach to partition a software application into kernels (which are executed […]
Feb, 21

Assembling large mosaics of electron microscope images using GPU

Understanding the neural circuitry of the retina requires us to map the connectivity of individual neurons in large neuronal tissue sections and analyze signal communication across processes from the electron microscopy images. One of the major bottlenecks in the critical path is the image mosaicing process where 2D slices are assembled from scanned microscopy image […]
Feb, 21

Accelerating the Stochastic Simulation Algorithm

In order for scientists to learn more about molecular biology, it is imperative that they have the ability to construct accurate models that predict the reactions of species of molecules. Generating these models using deterministic approaches is not feasible as these models may violate some of the assumptions underlying classical differential equations models (e.g., small […]
Feb, 21

Acceleration of Binomial Options Pricing via Parallelizing along time-axis on a GPU

Since the introduction of organized trading of options for commodities and equities, computing fair prices for options has been an important problem in financial engineering. A variety of numerical methods, including Monte Carlo methods, binomial trees, and numerical solution of stochastic differential equations, are used to compute fair prices. Traders and brokerage firms constantly strive […]
Feb, 21

A massively parallel framework using P systems and GPUs

Since the CUDA programming model appeared for general-purpose computation, developers can exploit the power of GPUs (Graphics Processing Units) across many computational domains. Among these domains, P systems or membrane systems provide a high-level computational modeling framework that allows, in theory, polynomial-time solutions to NP-complete problems by […]
Feb, 21

GPU Acceleration of Equations Assembly in Finite Elements Method – Preliminary Results

The finite element method (FEM) is widely used for the numerical solution of partial differential equations. Two computationally expensive tasks have to be performed in FEM: equations assembly and solution of the system of equations. We present a mapping of the equations assembly problem for a St. Venant-Kirchhoff material to the GPU computation model and show results of its […]
Feb, 21

Data parallel loop statement extension to CUDA: GpuC

In recent years, Graphics Processing Units (GPUs) have emerged as a powerful accelerator for general-purpose computations. GPUs are attached to every modern desktop and laptop host CPU as graphics accelerators. GPUs have over a hundred cores with lots of parallelism. Initially, they were used only for graphics applications such as image processing and video games. […]
Feb, 21

APEnet+: high bandwidth 3D torus direct network for petaflops scale commodity clusters

We describe herein the APElink+ board, a PCIe interconnect adapter featuring the latest advances in wire speed and interface technology plus hardware support for an RDMA programming model and experimental acceleration of GPU networking; this design allows us to build a low latency, high bandwidth PC cluster, the APEnet+ network, the new generation of our […]
Feb, 20

Final Project Implementing Extremely Randomized Trees in CUDA

In this paper, we present an implementation of extremely randomized trees (ERT), a supervised machine learning algorithm utilizing decision tree ensembles, in CUDA, NVIDIA’s GPU parallel programming extensions for C/C++. We describe the CUDA programming model and NVIDIA GPU architectures and explain the design tradeoffs that we made to exploit various forms of parallelism available […]
Feb, 20

Architecting graphics processors for non-graphics compute acceleration

This paper discusses the emergence of graphics processing units (GPUs) that contain architecture features for accelerating non-graphics (or GPGPU) applications. It provides an introduction for those interested in undertaking research at the intersection of manycore computing and GPU architecture. First, the motivation for using GPUs for non-graphics processing rather than developing specialized hardware is outlined. […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
