
Posts

Jan, 17

Four-dimensional Cone Beam CT Reconstruction and Enhancement using a Temporal Non-Local Means Method

Four-dimensional Cone Beam Computed Tomography (4D-CBCT) has been developed to provide respiratory phase-resolved volumetric imaging in image-guided radiation therapy (IGRT). An inadequate number of projections in each phase bin results in low-quality 4D-CBCT images with obvious streaking artifacts. In this work, we propose two novel 4D-CBCT algorithms: an iterative reconstruction algorithm and an […]
Jan, 17

Finding Convex Hulls Using Quickhull on the GPU

We present a convex hull algorithm that is accelerated on commodity graphics hardware. We analyze and identify the hurdles of writing a recursive divide-and-conquer algorithm on the GPU and devise a framework for representing this class of problems. Our framework transforms the recursive splitting step into a permutation step that is well-suited for […]
Jan, 16

Programming on Parallel Machines: GPU, Multicore, Clusters and More

This open-source textbook on parallel programming is aimed more at the practical end of things, in that: There is very little theoretical content, such as O() analysis, maximum theoretical speedup, acyclic graphs and so on; Real code is featured throughout; We use the main parallel platforms (OpenMP, CUDA and MPI) rather than languages that at this stage […]
Jan, 16

FPGA Based Acceleration of Decimal Operations

Field-Programmable Gate Arrays (FPGAs) can efficiently implement application-specific processors in non-conventional number systems, such as the decimal (Binary-Coded Decimal, or BCD) number system required for accounting accuracy in financial applications. The main purpose of this work is to show that applications requiring several decimal (BCD) operations can be accelerated by a processor implemented on […]
Jan, 16

A Modular System Architecture for Online Parallel Vision Pipelines

We present an architecture for real-time, online vision systems which enables development and use of complex vision pipelines integrating any number of algorithms. Individual algorithms are implemented using modular plugins, allowing integration of independently developed algorithms and rapid testing of new vision pipeline configurations. The architecture exploits the parallelization of graphics processing units (GPUs) and […]
Jan, 16

Fast Regularization of Matrix-Valued Images

Regularization of matrix-valued data is important in medical imaging, motion analysis and scene understanding. In this report we describe a novel method for efficient regularization of matrix group-valued images. Using the augmented Lagrangian framework, we separate the total-variation regularization of matrix-valued images into regularization and projection steps, both of which are fast and […]
Jan, 16

Theano: Deep Learning on GPUs with Python

In this paper, we present Theano, a framework in the Python programming language for defining, optimizing and evaluating expressions involving high-level operations on tensors. Theano offers most of NumPy’s functionality, but adds automatic symbolic differentiation, GPU support, and faster expression evaluation. Theano is a general mathematical tool, but it was developed with the goal of […]
Jan, 16

A Fast Jet Finder Algorithm Using Graphic Processing Unit

A collimated emission of hadrons, usually called a jet, is the experimental counterpart of the partons (quarks and gluons), which are not observed separately. The CMS detector at the LHC is ideally designed to study jet tomography, which is an important probe for investigating the hot and dense medium formed during heavy-ion collisions. Although CMS […]
Jan, 16

Data Triage and Visual Analytics for Scientific Visualization

As the speed of computers continues to increase at a very fast rate, the size of data generated from scientific simulations has now reached petabytes ($10^{15}$ bytes) and beyond. Under such circumstances, no existing techniques can be used to perform effective data analysis at full precision. To analyze large-scale data sets, visual analytics […]
Jan, 16

Feature Extraction and Visualization from Higher-Order CFD Data

Computational fluid dynamics (CFD) methods have been employed in the study of subjects such as aeroacoustics, gas dynamics, turbomachinery and viscoelastic fluids, among others. However, the need for accuracy and high performance has resulted in methods whose solutions are becoming increasingly complex. In this context, feature extraction and visualization methods play a key role, making […]
Jan, 16

Parallel AES Encryption Engines for Many-Core Processor Arrays

By exploring different granularities of data-level and task-level parallelism, we map 16 implementations of an Advanced Encryption Standard (AES) cipher with both online and offline key expansion on a fine-grained many-core system. The smallest design utilizes only 6 cores for offline key expansion and 8 cores for online key expansion, while the largest requires 107 […]
Jan, 16

Nested Intervals Tree Encoding with System of Residual Classes

This paper describes one way to represent tree-like structures in databases. The authors propose extending V. Tropashko's nested intervals model: the numbers in the intervals are represented in a system of residual classes. This avoids storing large numbers and is easily implemented with parallel-programming algorithms.

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
