
Posts

Aug, 18

Physical and graphical effects in OpenCL by example

There are strong indications that the future of interactive graphics involves a more flexible programming model than today’s OpenGL/Direct3D pipelines. That means that graphics developers will need a basic understanding of how to combine emerging parallel-programming techniques with the traditional interactive rendering pipeline. This course provides an introduction to parallel-programming architectures and environments for interactive […]
Aug, 18

Parallelization of the x264 encoder using OpenCL

With the introduction of H.264, the complexity of video encoders has increased dramatically. While hardware-based encoding solutions profit from the strict sequential design and already feature real-time capabilities for high-definition material, software solutions lack most of the encoding performance. More precisely, the performance of software encoders is limited due to the computation […]
Aug, 18

Simulating Biological-Inspired Spiking Neural Networks with OpenCL

The algorithms used for simulating biologically-inspired spiking neural networks (BIANN) often utilize functions which are computationally complex and have to model a large number of neurons – or even a much larger number of synapses in parallel. To use all available computing resources provided by a standard desktop PC is an opportunity to shorten the […]
Aug, 18

Parallel Batch Training of the Self-Organizing Map Using OpenCL

Self-Organizing Maps (SOMs) are popular artificial neural networks that are often used for data analysis through clustering and visualisation. The SOM's mathematical model is inherently parallel. However, many implementations have not successfully exploited its parallelism because previous attempts often required cluster-like infrastructures. This article presents the parallel implementation of SOMs, particularly the batch map variant […]
Aug, 18

Maestro: Data Orchestration and Tuning for OpenCL Devices

As heterogeneous computing platforms become more prevalent, the programmer must account for complex memory hierarchies in addition to the difficulties of parallel programming. OpenCL is an open standard for parallel computing that helps alleviate this difficulty by providing a portable set of abstractions for device memory hierarchies. However, OpenCL requires that the programmer explicitly control […]
Aug, 18

Optimizing the exploitation of multicore processors and GPUs with OpenMP and OpenCL

In this paper, we present OMPSs, a programming model based on OpenMP and StarSs, that can also incorporate the use of OpenCL or CUDA kernels. We evaluate the proposal on three different architectures, SMP, Cell/B.E. and GPUs, showing the wide usefulness of the approach. The evaluation is done with four different benchmarks, Matrix Multiply, BlackScholes, […]
Aug, 18

Analyzing program flow within a many-kernel OpenCL application

Many developers have begun to realize that heterogeneous multi-core and many-core computer systems can provide significant performance opportunities to a range of applications. Typical applications possess multiple components that can be parallelized; developers need to be equipped with proper performance tools to analyze program flow and identify application bottlenecks. In this paper, we analyze and […]
Aug, 17

Near real-time Fast Bilateral Stereo on the GPU

State-of-the-art local stereo correspondence algorithms that adapt their supports to image content make it possible to infer very accurate disparity maps, often comparable to those of algorithms based on global disparity optimization methods. However, despite their effectiveness, accurate local approaches based on this methodology are also computationally expensive and several simplifications aimed at reducing their computational […]
Aug, 17

Fast boosting trees for classification, pose detection, and boundary detection on a GPU

Discriminative classifiers are often the computational bottleneck in medical imaging applications such as foreground/background classification, 3D pose detection, and boundary delineation. To overcome this bottleneck, we propose a fast technique based on boosting tree classifiers adapted for GPU computation. Unlike standard tree-based algorithms, our method does not have any recursive calls, which makes it GPU-friendly. […]
Aug, 17

GPU-based reconstruction and display for 4D ultrasound data

Due to the computational effort required for 4D ultrasound imaging, such systems depend on low-complexity techniques like nearest neighbor interpolation, which affects volume quality. Moreover, more accurate techniques like normalized convolution, backward trilinear interpolation, and forward spherical and ellipsoidal Gaussian kernels are avoided in real-time imaging because of the tight reconstruction time. The goal […]
Aug, 17

nGFSIM: A GPU-based fault simulator for 1-to-n detection and its applications

We present nGFSIM, a GPU-based fault simulator for stuck-at faults which can report the fault coverage of one- to n-detection for any specified integer n using only a single run of fault simulation. nGFSIM, which explores the massive parallelism in the GPU architecture and optimizes the memory access and usage, enables accelerated fault simulation without the […]
Aug, 17

GPU accelerated FDTD solver and its application in MRI

The finite difference time domain (FDTD) method is a popular technique for computational electromagnetics (CEM). The large computational power often required, however, has been a limiting factor for its applications. In this paper, we will present a graphics processing unit (GPU)-based parallel FDTD solver and its successful application to the investigation of a novel B1 […]

HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors
