Papers on hgpu.org (.txt-file)

An experimental study of group-by and aggregation on CPU-GPU processors Download

An Experimental Study of SYCL Task Graph Parallelism for Large-Scale Machine Learning Workloads Download

An experimental study on performance portability of OpenCL kernels Download

An Explicit Algorithm for Porous Media Flow Simulation using GPUs Download Package

An exploration of CUDA and CBEA for a gravitational wave data-analysis application (Einstein@Home) Download Package

An exploration of CUDA and CBEA for a gravitational wave source-modelling application Download

An Exploration of OpenCL for a Numerical Relativity Application Download

An Exploration of OpenCL on Multiple Hardware Platforms for a Numerical Relativity Application Download

An Exploratory Study of High Performance Graphics Application Programming Interfaces Download

An extended GPU radiosity solver Download

An Extensible Component-based Approach to Simulation Systems on Heterogeneous Clusters Download

An Extensible Framework for Composing Stencils with Common Scientific Computing Patterns Download

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs Download

An FPGA Accelerator for Molecular Dynamics Simulation Using OpenCL Download

An FPGA Implementation of Information Theoretic Visual-Saliency System and Its Optimization

An FPGA-based processing pipeline for high definition stereo video Download

An FPGA-based Torus Communication Network Download

An FPGA-specific algorithm for direct generation of multi-variate Gaussian random numbers

An hardware architecture for 3D object tracking and motion estimation Download

An hybrid AES-256-GCM implementation for NEON CPU & CUDA GPU Download Package

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices Download Package

An image-warping VR-architecture: design, implementation and applications Download

An implementation and its evaluation of password cracking tool parallelized on GPGPU

An implementation for quad-tree based solid object coloring using CUDA Download

An implementation of a reordering approach for increasing the product of diagonal entries in a sparse matrix Download Package

An Implementation of Coincidence Algorithm on Graphic Processing Units Download

An Implementation of Conflict-Free Offline Permutation on the GPU Download

An Implementation of Differential Evolution for Independent Tasks Scheduling on GPU Download

An implementation of level set based topology optimization using GPU Download

An Implementation of Real-Time Phased Array Radar Fundamental Functions on a DSP-Focused, High-Performance, Embedded Computing Platform Download

An Implementation of the Discontinuous Galerkin Method on Graphics Processing Units Download Package

An Implementation of the Smooth Particle Mesh Ewald Method on GPU Hardware

An implementation of the tile QR factorization for a GPU and multiple CPUs Download

An implicit multigrid solver for high-order compressible flow simulations on GPUs Download

An implicit Tensor-Mass solver on the GPU for soft bodies simulation Download

An Improved CUDA-Based Implementation of Differential Evolution on GPU Download

An Improved Image Segmentation Algorithm Based on GPU Parallel Computing Download

An improved implementation of Preconditioned Conjugate Gradient Method on GPU Download

An Improved Magma Gemm For Fermi Graphics Processing Units Download Package

An Improved Monte Carlo Ray Tracing for Large-Scale Rendering in Hadoop Download

An Improved Parallel Algorithm using GPU for Siting Observers on Terrain Download

An improved parallel contrast-aware halftoning Download

An Improved Parallel Implementation of 3D DRIE Simulation on GPU Download

An improved scheme of an interactive finite element model for 3D soft-tissue cutting and deformation Download

An Improved Study of Physically Based Fluid Simulation on GPU

An improved study of real-time fluid simulation on GPU Download

An improved visual inspection system using visual servo

An in-depth performance analysis of irregular workloads on VLIW APU Download

An In-depth Performance Characterization of CPU- and GPU-based DNN Training on Modern Architectures Download

An Incompressible Navier-Stokes Equations Solver on the GPU Using CUDA Download

An initial performance review of software components for a heterogeneous computing platform Download

An innovative compilation tool-chain for embedded multi-core architectures Download

An instruction-systolic programmable shader architecture for multi-threaded 3D graphics processing

An Integrated Framework for Feature Extraction, Object Recognition and Stereo Vision with GPU support Download

An integrated GPU power and performance model Download

An intelligent semi-automatic application porting system for application accelerators

An intelligent system for accelerating parallel SVM classification problems on large datasets using GPU

An Interest Point Based Illumination Condition Matching Approach to Photometric Registration Within Augmented Reality Worlds Download

An Interface for Halo Exchange Pattern Download

An Intermediate Library for Multi-GPUs Computing Skeletons Download

An Interrupt-Driven Work-Sharing For-Loop Scheduler Download Package

An Introduction to GPU Accelerated Surgical Simulation Download

An Introduction to High Performance Computing on AWS Download

An Introduction to OpenCL C++ Download

An Introduction to the OpenCL Programming Model Download

An introductory tour of interactive rendering Download

An Investigation into Concurrent Expectation Propagation Download

An Investigation of Atomic Synchronization for Sort-Based Group-By Aggregation on GPUs Download

An investigation of GPU-based stiff chemical kinetics integration methods Download

An Investigation of the Performance Portability of OpenCL Download Package

An Investigation of Unified Memory Access Performance in CUDA Download

An MDE Approach for Automatic Code Generation from MARTE to OpenCL Download

An MPI-Based Python Framework for Distributed Training with Keras Download Package

An MPI-CUDA Implementation and Optimization for Parallel Sparse Equations and Least Squares (LSQR) Download

An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters Download

An MPI-CUDA Implementation for the Compression of DEM Download

An MPI-CUDA implementation of an improved Roe method for two-layer shallow water systems Download

An N log N Parallel Fast Direct Solver for Kernel Matrices Download Package

An octree-based proxy for collision detection in large-scale particle systems Download

An On-Demand Fast Parallel Pseudo Random Number Generator with Applications Download

An open framework for rapid prototyping of signal processing applications Download

An open source finite-difference time-domain solver for room acoustics using graphics processing units Download Package

An open source MATLAB program for fast numerical Feynman integral calculations for open quantum system dynamics on GPUs Download

An Open-source FPGA Library for Data Sorting Download Package

An Open-Source GPU-Accelerated Feature Extraction Tool Download

An OpenCL 3D FFT for Molecular Dynamics Simulations on Multiple FPGAs Download

An OpenCL design of the Bob Jenkins lookup3 hash function using the Xilinx SDAccel Development Environment Download

An OpenCL Fast Fourier Transformation Download

An OpenCL framework for heterogeneous multicores with local memory

An OpenCL implementation for the solution of TDSE on GPU and CPU architectures Download

An OpenCL implementation of a forward sampling algorithm for CP-logic Download

An OpenCL Method of Parallel Sorting Algorithms for GPU Architecture Download

An OpenCL Runtime and Scheduler for Embedded Multicore DSP Parallel Systems Download

An OpenCL-Based FPGA Accelerator for Faster R-CNN Download Package

An OpenCL-based Implementation of H.264 Encoder Download

An OpenCL-based Monte Carlo dose calculation engine (oclMC) for coupled photon-electron transport Download

An OpenCL(TM) Deep Learning Accelerator on Arria 10 Download

An OpenMP Programming Environment on Mobile Devices Download

An optimal k-exclusion real-time locking protocol motivated by multi-GPU systems Download

An Optimal Offline Permutation Algorithm on the Hierarchical Memory Machine, with the GPU implementation Download


Brief statistics for this page

Titles: 100

Open PDFs available for download: 90

Packages: 14

* * *


HGPU group © 2010-2023 hgpu.org

All rights belong to the respective authors
