Papers on hgpu.org (.txt-file)
An acceleration of the algorithm for the nurse rerostering problem on a graphics processing unit
An Accelerator based on the rho-VEX Processor: an Exploration using OpenCL
An adaptative game loop architecture with automatic distribution of tasks between CPU and GPU
An Adaptative Multi-GPU based Branch-and-Bound. A Case Study: the Flow-Shop Scheduling Problem
An adaptive Expectation-Maximization algorithm with GPU implementation for electron cryomicroscopy
An Adaptive Framework for Managing Heterogeneous Many-Core Clusters
An adaptive framework for visualizing unstructured grids with time-varying scalar fields
An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment
An Adaptive Multi-Spline Refinement Algorithm in Simulation Based Sailboat Trajectory Optimization Using Onboard Multi-Core Computer Systems
An Adaptive Multiresolution Mesh Representation for CPU-GPU Coupled Computation
An adaptive octree textures painting algorithm
An adaptive performance modeling tool for GPU architectures
An Adaptive Step Size GPU ODE Solver for Simulating the Electric Cardiac Activity
An algebraic parallel treecode in arbitrary dimensions
An Algorithm for Detecting Cycles in Undirected Graphs using CUDA Technology
An Algorithm for Fast Edit Distance Computation on GPUs
An algorithm-architecture co-design framework for gridding reconstruction using FPGAs
An Analysis of Conventional and Heterogeneous Workloads on Production Supercomputing Resources
An Analysis of OpenACC Programming Model: Image Processing Algorithms as a Case Study
An Analysis of Programmer Productivity versus Performance for High Level Data Parallel Programming
An Analysis of Variation Between Cores For Intel Xeon Phi Knights Corner And Xeon Phi Knights Landing
An Analytical Approach of Mars Rovers by Using GPU Technology and Genetic Algorithm
An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
An application of graphical numerical accelerators in simulations of ion-transport through biological membranes
An Approach for Maximizing Performance on Heterogeneous Clusters of CPU and GPU
An approach for the effective utilization of GP-GPUs in parallel combined simulation
An Approach for Traffic Forecast with GPU Computing & Cellular Automata Model
An approach of tool paths generation for CNC machining based on CUDA
An Approach to Efficient FEM Simulations on Graphics Processing Units Using CUDA
An approach to performance portability through generic programming
An Architectural Journey into RISC Architectures for HPC Workloads
An architecture design of GPU-accelerated VoD streaming servers with network coding
An Architecture for Distributed Behavioral Models with GPUs
An architecture for real time fluid simulation using multiple GPUs
An asymmetric distributed shared memory model for heterogeneous parallel systems
An Asynchronous Dataflow-Driven Execution Model For Distributed Accelerator Computing
An Asynchronous Event Communication Technique for Soft Real-Time GPGPU Applications
An Auto-Programming Approach to Vulkan
An Auto-tuned Method for Solving Large Tridiagonal Systems on the GPU
An auto-tuning framework for parallel multicore stencil computations
An Auto-tuning Solution to Data Streams Clustering in OpenCL
An Automated Approach for SIMD Kernel Generation for GPU based Software Acceleration
An Automated Tool for Converting Directive Based C Code Into Parallel CUDA Code
An Automated Video Surveillance System Using Viewpoint Feature Histogram and CUDA-enabled GPUs
An Automatic Host and Device Memory Allocation Method for OpenMPC
An Automatic Input-Sensitive Approach for Heterogeneous Task Partitioning
An Automatic OpenCL Compute Kernel Generator for Basic Linear Algebra Operations
An Automatic Speech Recognition Application Framework for Highly Parallel Implementations on the GPU
An Autotuning Framework for Intel Xeon Phi Platforms
An effective GPU implementation of breadth-first search
An Effective Model of CPU/GPU Collaborative Computing in GPU Clusters
An Efficient Acceleration of Digital Forensics Search Using GPGPU
An Efficient Approach for Generating Pencil Filter and Its Implementation on GPU
An Efficient Block Cipher Implementation on Many-Core Graphics Processing Units
An Efficient Cell List Implementation for Monte Carlo Simulation on GPUs
An Efficient Common Substrings Algorithm for On-the-Fly Behavior-Based Malware Detection and Analysis
An Efficient Deterministic Parallel Algorithm for Adaptive Multidimensional Numerical Integration on GPUs
An Efficient Dispatcher for Large Scale Graph Processing on OpenCL-based FPGAs
An Efficient Fine-grained Parallel Genetic Algorithm Based on GPU-Accelerated
An efficient GPU acceptance-rejection algorithm for the selection of the next reaction to occur for Stochastic Simulation Algorithms
An efficient GPU algorithm for tetrahedron-based Brillouin-zone integration
An Efficient GPU Implementation of Modified Discrete Cosine Transform Using CUDA
An efficient GPU implementation of the revised simplex method
An efficient GPU-based approach for interactive global illumination
An efficient GPU-based time domain solver for the acoustic wave equation
An Efficient Hardware Accelerator for Structured Sparse Convolutional Neural Networks on FPGAs
An Efficient Implementation of Double Precision 1-D FFT for GPUs Using CUDA
An Efficient Implementation of GPU Virtualization in High Performance Clusters
An efficient implementation of Smith Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases
An Efficient Implementation of the Longest Common Subsequence Algorithm with Bit-Parallelism on GPUs
An efficient KNN algorithm implemented on FPGA based heterogeneous computing system using OpenCL
An Efficient Load Balancing Method for Tree Algorithms
An efficient midpoint-radius representation format to deal with symmetric fuzzy numbers
An efficient mixed-precision, hybrid CPU-GPU implementation of a fully implicit particle-in-cell algorithm
An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor
An Efficient Multiway Mergesort for GPU Architectures
An efficient numerical method for solving the Boltzmann equation in multidimensions
An efficient out-of-core volume rendering method based on ray casting and GPU acceleration
An efficient parallel algorithm for accelerating computational protein design
An Efficient Parallel Algorithm for Graph Isomorphism on GPU using CUDA
An Efficient Parallel Data Clustering Algorithm Using Isoperimetric Number of Trees
An Efficient Parallel GPU Evaluation of Small Angle X-Ray Scattering Profiles
An Efficient Parallel ISODATA Algorithm Based on Kepler GPUs
An Efficient Parallel Motion Estimation Algorithm and X264 Parallelization in CUDA
An Efficient SAR Processor Based on GPU via CUDA
An efficient scheduling scheme using estimated execution time for heterogeneous computing systems
An Efficient Signal Processor of Synthetic Aperture Radar Based on GPU
An Efficient Simulation Environment for Modeling Large-Scale Cortical Processing
An efficient solution for hazardous geophysical flows simulation using GPUs
An efficient stochastic approach to groupwise non-rigid image registration
An Efficient Stream Buffer Mechanism for Dataflow Execution on Heterogeneous Platforms with GPUs
An Efficient Work-Distribution Strategy for Gridding Radio-Telescope Data on GPUs
An Efficient WSN Simulator for GPU-Based Node Performance
An Efficient, Automatic Approach to High Performance Heterogeneous Computing
An efficient, model-based CPU-GPU heterogeneous FFT library
Titles: 100
open PDFs: 83
packages: 9