
Posts

May, 29

A GPU-Based Track-Repeating Algorithm for Dose Calculation for Photon Radiotherapy

An essential ingredient in radiotherapy is the calculation of the dose to be delivered to the patient. Analytical algorithms are commonly used for this task; however, their accuracy is not always satisfactory. Monte Carlo techniques provide higher accuracy, but they often require long computation times. Track-repeating algorithms, for example the Fast Dose Calculator, have […]
May, 29

Hybrid Update Algorithms for Regular Lattice and Small-World Ising Models on Graphical Processing Units

Local and cluster Monte Carlo update algorithms offer a complex tradeoff space for optimising the performance of simulations of the Ising model. We systematically explore tradeoffs between hybrid Metropolis and Wolff cluster updates for the 3D Ising model using data-parallelism and graphical processing units. We investigate performance for both regular lattices as well as for […]
May, 29

CUDA Implementation of Parallel Algorithms for Animal Noseprint Identification

Concern about the threats posed by the natural proliferation of animal-borne human diseases like BSE ("mad cow disease"), and by the possible use of animals as disease vectors in bioterrorism, has spurred heightened interest in the development of methods for rapid automated identification of individual animals of various societally and commercially important mammalian species. Just as […]
May, 29

A fight for performance and accuracy of the matrix multiplication routines: CUBLAS on Nvidia Tesla versus MKL and ATLAS on Intel Nehalem

Scientific computation relies heavily on 64-bit arithmetic. The evolution of Graphical Processing Units into massively micro-parallel vector units and the improvement of their programmability make them powerful algebraic coprocessors for many classes of matrix calculus. But on these processors inheriting from architectures dedicated to video processing in the […]
May, 29

Using OpenCL to Calculate a Pressure Field

This report details the conversion of a CUDA program into an OpenCL program adaptable to many platforms. Originally, the CUDA program could only be run on an NVIDIA graphics card, which limited the program's applicability for users. Throughout this project the above authors learned how to program […]
May, 29

Massively Parallel Neural Encoding and Decoding of Visual Stimuli

The massively parallel nature of video Time Encoding Machines (TEMs) calls for scalable, massively parallel decoders that are implemented with neural components. The current generation of decoding algorithms is based on computing the pseudo-inverse of a matrix and does not satisfy these requirements. Here we consider video TEMs with an architecture built using Gabor receptive […]
May, 29

Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project

Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great […]
May, 29

Explicit Cache Management for Volume Ray-Casting on Parallel Architectures

A major challenge when designing general purpose graphics hardware is to allow efficient access to texture data. Although different rendering paradigms vary with respect to their data access patterns, there is no flexibility when it comes to data caching provided by the graphics architecture. In this paper we focus on volume ray-casting, and show the […]
May, 27

The Third International Workshop on Frontier of GPU Computing, FGC 2012

To be held in conjunction with HPCC 2012. The goal of this workshop is to provide a forum for researchers and practitioners to discuss and share their research and development experiences and outputs on massively parallel GPU platforms, software development tools, optimization techniques, parallel algorithm design, and all kinds of successful applications. We solicit […]
May, 26

Parameterized Verification of GPU Kernel Programs

We present an automated symbolic verifier for checking the functional correctness of GPGPU kernels parametrically, for an arbitrary number of threads. Our tool PUGpara checks the functional equivalence of a kernel and its optimized versions, helping debug errors introduced during memory coalescing and bank conflict elimination related optimizations. Key features of our work include: (1) […]
May, 26

Parallel Parametric Optimisation with Firefly Algorithms on Graphical Processing Units

Parametric optimisation techniques such as Particle Swarm Optimisation (PSO), Firefly algorithms (FAs), and genetic algorithms (GAs) are at the centre of attention in a range of optimisation problems where local minima plague the parameter space. Variants of these algorithms deal with the problems presented by local minima in a variety of ways. A salient feature in […]
May, 26

Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 1. Generalized Born

We present an implementation of generalized Born implicit solvent all-atom classical molecular dynamics (MD) within the AMBER program package that runs entirely on CUDA-enabled NVIDIA graphics processing units (GPUs). We discuss the algorithms that are used to exploit the processing power of the GPUs and show the performance that can be achieved in comparison […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
