Posts

Jan, 17

Simulation Valuation of Multiple Exercise Options

Multiple exercise options generalize American-style options, as they allow the holder multiple exercise rights and control over the exercise amounts. They arise in both real and financial option applications, such as tolling agreements and swing options, which are primarily used in the energy industry. The Forest of Stochastic Meshes is a recently proposed simulation method […]
Jan, 17

A Template Metaprogramming Approach to Support Parallel Programs for Multicores

With the advent of the multicore era, the plain C/C++ programming language can no longer fully reflect computer architectures. Source-to-source transformation helps tailor programs to contemporary hardware. We propose a template-based approach to perform transformation for programs with rich static information. The template metaprogramming techniques we present can conduct parallelization and memory hierarchical optimization for specific […]
Jan, 17

Four-dimensional Cone Beam CT Reconstruction and Enhancement using a Temporal Non-Local Means Method

Four-dimensional Cone Beam Computed Tomography (4D-CBCT) has been developed to provide respiratory phase resolved volumetric imaging in image guided radiation therapy (IGRT). An inadequate number of projections in each phase bin results in low-quality 4D-CBCT images with obvious streaking artifacts. In this work, we propose two novel 4D-CBCT algorithms: an iterative reconstruction algorithm and an […]
Jan, 17

Finding Convex Hulls Using Quickhull on the GPU

We present a convex hull algorithm that is accelerated on commodity graphics hardware. We analyze and identify the hurdles of writing a recursive divide-and-conquer algorithm on the GPU and devise a framework for representing this class of problems. Our framework transforms the recursive splitting step into a permutation step that is well-suited for […]
Jan, 16

Programming on Parallel Machines: GPU, Multicore, Clusters and More

This open-source textbook on parallel programming is aimed more at the practical end of things, in that: There is very little theoretical content, such as O() analysis, maximum theoretical speedup, acyclic graphs and so on; Real code is featured throughout; We use the main parallel platforms (OpenMP, CUDA and MPI) rather than languages that at this stage […]
Jan, 16

FPGA Based Acceleration of Decimal Operations

Field Programmable Gate-Arrays (FPGAs) can efficiently implement application specific processors in non-conventional number systems, such as the decimal (Binary-Coded Decimal, or BCD) number system required for accounting accuracy in financial applications. The main purpose of this work is to show that applications requiring several decimal (BCD) operations can be accelerated by a processor implemented on […]
Jan, 16

A Modular System Architecture for Online Parallel Vision Pipelines

We present an architecture for real-time, online vision systems which enables development and use of complex vision pipelines integrating any number of algorithms. Individual algorithms are implemented using modular plugins, allowing integration of independently developed algorithms and rapid testing of new vision pipeline configurations. The architecture exploits the parallelization of graphics processing units (GPUs) and […]
Jan, 16

Fast Regularization of Matrix-Valued Images

Regularization of matrix-valued data is of importance in medical imaging, motion analysis and scene understanding. In this report we describe a novel method for efficient regularization of matrix group-valued images. Using the augmented Lagrangian framework we separate the total-variation regularization of matrix-valued images into regularization and projection steps, both of which are fast and […]
Jan, 16

Theano: Deep Learning on GPUs with Python

In this paper, we present Theano, a framework in the Python programming language for defining, optimizing and evaluating expressions involving high-level operations on tensors. Theano offers most of NumPy’s functionality, but adds automatic symbolic differentiation, GPU support, and faster expression evaluation. Theano is a general mathematical tool, but it was developed with the goal of […]
Jan, 16

A Fast Jet Finder Algorithm Using Graphic Processing Unit

A collimated emission of hadrons, usually called a jet, is the experimental counterpart of the partons (quarks and gluons), which are not observed separately. The CMS detector at the LHC is ideally designed to study jet tomography, which is an important probe to investigate the hot and dense medium formed during heavy ion collisions. Although CMS […]
Jan, 16

Data Triage and Visual Analytics for Scientific Visualization

As the speed of computers continues to increase at a very fast rate, the size of data generated from scientific simulations has now reached petabytes ($10^{15}$ bytes) and beyond. Under such circumstances, no existing techniques can be used to perform effective data analysis at full precision. To analyze large scale data sets, visual analytics […]
Jan, 16

Feature Extraction and Visualization from Higher-Order CFD Data

Computational fluid dynamics (CFD) methods have been employed to study subjects such as aeroacoustics, gas dynamics, turbomachinery and viscoelastic fluids, among others. However, the need for accuracy and high performance has resulted in methods whose solutions are becoming increasingly complex. In this context, feature extraction and visualization methods play a key role, making […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
