Papers on hgpu.org (.txt-file)
Exploring data flow design and vectorization with oneAPI for streaming applications on CPU+GPU

Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool (open-source code)

Exploring Different Automata Representations for Efficient Regular Expression Matching on GPUs

Exploring Fine-Grained Task-based Execution on Multi-GPU Systems

Exploring FPGA Optimizations to Compute Sparse Numerical Linear Algebra Kernels

Exploring FPGA-specific Optimizations for Irregular OpenCL Applications

Exploring GPGPU Acceleration of Process-Oriented Simulations

Exploring GPGPU workloads: Characterization methodology, analysis and microarchitecture evaluation implications

Exploring GPGPUs Workload Characteristics and Power Consumption

Exploring GPU Memory Performance Using Digital Image Processing Algorithms

Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

Exploring Graphics Processing Unit (GPU) Resource Sharing Efficiency for High Performance Computing

Exploring graphics processing units as parallel coprocessors for online aggregation

Exploring graphics processor performance for general purpose applications

Exploring Heterogeneous Scheduling using the Task-Centric Programming Model

Exploring High Performance SQL Databases with Graphics Processing Units

Exploring LLVM Infrastructure for Simplified Multi-GPU Programming

Exploring Many-Core Design Templates for FPGAs and ASICs

Exploring Microcontrollers in GPUs

Exploring Multi-level Parallelism for Large-Scale Spiking Neural Networks

Exploring Multiple Dimensions of Parallelism in Junction Tree Message Passing

Exploring Multiple Levels of Performance Modeling for Heterogeneous Systems

Exploring new architectures in accelerating CFD for Air Force applications

Exploring Novel Parallelization Technologies for 3-D Imaging Applications

Exploring Optimisations for the Local Assembly phase of Finite Element Methods on GPUs

Exploring Parallel Algorithms for Volumetric Mass-Spring-Damper Models in CUDA

Exploring Portability and Performance of OpenCL FPGA Kernels on Intel HARPv2

Exploring power efficiency and optimizations targeting heterogeneous applications

Exploring Programming Multi-GPUs using OpenMP & OpenACC-based Hybrid Model

Exploring reconfigurable architectures for explicit finite difference option pricing models

Exploring Reconfigurable Architectures for Tree-Based Option Pricing Models

Exploring Scalability in C++ Parallel STL Implementations

Exploring scalability of FIR filter realizations on Graphics Processing Units

Exploring SIMD for Molecular Dynamics, Using Intel Xeon Processors and Intel Xeon Phi Coprocessors

Exploring SYCL as a Portability Layer for High-Performance Computing on CPUs

Exploring SYCL for batched kernels with memory allocations

Exploring Task Parallelism for Heterogeneous Systems Using Multicore Task Management API

Exploring the acceleration of Nekbone on reconfigurable architectures

Exploring the Feasibility of Fully Homomorphic Encryption

Exploring The Latency and Bandwidth Tolerance of CUDA Applications

Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload

Exploring the Limits of GPUs With Parallel Graph Algorithms

Exploring the Millennium Run – Scalable Rendering of Large-Scale Cosmological Datasets

Exploring the multiple-GPU design space

Exploring the Multitude of Real-Time Multi-GPU Configurations

Exploring the Optimization Space of Multi-Core Architectures with OpenCL Benchmarks

Exploring the power of GPU’s for training Deep Belief Networks

Exploring the Suitability of Remote GPGPU Virtualization for the OpenACC Programming Model Using rCUDA

Exploring the tradeoffs between programmability and efficiency in data-parallel accelerators

Exploring the use of glossy light volumes for interactive global illumination

Exploring Thread Coarsening on FPGA

Exploring Traditional and Emerging Parallel Programming Models using a Proxy Application

Exploring utilisation of GPU for database applications

Exploring weak scalability for FEM calculations on a GPU-enhanced cluster

Exponential integrators on graphic processing units

Exponential Integrators on Graphics Processing Units

Exposing Errors Related to Weak Memory in GPU Applications

Exposing Fine-Grained Parallelism in Algebraic Multigrid Methods

Exposing non-standard architectures to embedded software using compile-time virtualisation

Exposure Render: An Interactive Photo-Realistic Volume Rendering Framework

Expressed Sequence Tag Clustering using Commercial Gaming Hardware

Expressive Array Constructs in an Embedded GPU Kernel Programming Language

Extendable pattern-oriented optimization directives

Extendable Pattern-Oriented Optimization Directives (extended version)

Extended Data Collection: Analysis of Cache Behavior and Performance of Different BVH Memory Layouts for Tracing Incoherent Rays

Extended Dynamic Programming and Fast Multidimensional Search Algorithm for Energy Minimization in Stereo and Motion

Extended-precision floating-point numbers for GPU computation

Extending a C-like Language for Portable SIMD Programming

Extending a Run-time Resource Management framework to support OpenCL and Heterogeneous Systems

Extending abstract GPU APIs to shared memory

Extending adaptive sparse grids for stochastic collocation to hybrid parallel architectures

Extending High-Level Synthesis for Task-Parallel Programs

Extending Lyapack for the Solution of Band Lyapunov Equations on Hybrid CPU-GPU Platforms

Extending MAGMA Portability with OneAPI

Extending OmpSs for OpenCL kernel co-execution in heterogeneous systems

Extending OmpSs to support CUDA and OpenCL in C, C++ and Fortran Applications

Extending Scala with General Purpose GPU Programming

Extending SYCL’s Programming Paradigm with Tensor-based SIMD Abstractions

Extending the Computational Application of Reaction-Diffusion Chemistry by Modelling Artificial Neural Networks

Extending the Generalized Fermat Prime Number Search Beyond One Million Digits Using GPUs

Extending the Gotran framework: LaTeX and GPU acceleration

Extending the Scalability of Single Chip Stream Processors with On-chip Caches

Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems

Extension of the SkePU Skeleton Programming Framework for Multi-core CPU and Multi-GPU Systems for MPI-based Clusters

Extensions and Limitations of the Neural GPU

Extensions of Parallel Coordinates for Interactive Exploration of Large Multi-Timepoint Data Sets

Extinction-Based Shading and Illumination in GPU Volume Ray-Casting

Extracting Flow Features Using Bag-of-Features and Supervised Learning Techniques

Extracting Maximal Exact Matches on GPU

Extremely fast simulator for decoding LDPC codes

Extremely large scale simulation of a Kardar-Parisi-Zhang model using graphics cards

Eye-Full Tower: A GPU-based variable multibaseline omnidirectional stereovision system with automatic baseline selection for outdoor mobile robot navigation

Face Detection CUDA Accelerating

Face Detection for Human Identification in Surveillance

Face Detection with Improved Local Binary Patterns in CUDA

Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs

Titles: 100
open PDFs: 97
packages: 17
