Papers on hgpu.org
word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement

Work Efficient Parallel Algorithms for Large Graph Exploration

Work in Progress: Vortex Detection and Visualization for Design of Micro Air Vehicles and Turbomachinery

Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths

Working With Incremental Spatial Data During Parallel (GPU) Computation

Workload Analysis and Efficient OpenCL-based Implementation of SIFT Algorithm on a Smartphone

Workload and network-optimized computing systems

Workload Aware Algorithms for Heterogeneous Platforms

Workload Balancing on Heterogeneous Systems: A Case Study of Sparse Grid Interpolation

Workload Characterization of 3D Games

Workload distribution and balancing in FPGAs and CPUs with OpenCL and TBB

Workload Scheduling on Heterogeneous Devices

Workload-aware Automatic Parallelization for Multi-GPU DNN Training

Worst-Case Execution Time Guarantees for Runtime-Reconfigurable Architectures

WPA/WPA2 Password Security Testing using Graphics Processing Units

Wrinkling Coarse Meshes on the GPU

Writing a modular GPGPU program in Java

Writing a performance-portable matrix multiplication

Writing self-adaptive codes for heterogeneous systems

X-Device Query Processing by Bitwise Distribution

X-toon: an extended toon shader

XBOOLE-CUDA: Fast Boolean Operations on the GPU

Xbox360 Front Side Bus – A 21.6 GB/s End-to-End Interface Design

Xeon Phi: A comparison between the newly introduced MIC architecture and a standard CPU through three types of problems

XeonPhi Meets Astrophysical Fluid Dynamics

XGBoost: Scalable GPU Accelerated Learning

XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures

XMalloc: A Scalable Lock-free Dynamic Memory Allocator for Many-core Machines

XML3D: interactive 3D graphics for the web

XMT-GPU: A PRAM Architecture for Graphics Computation

XSD: Accelerating MapReduce by Harnessing the GPU inside an SSD

YaDiV - an open platform for 3D visualization and 3D segmentation of medical data

YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights

You Can Type, but You Can’t Hide: A Stealthy GPU-based Keylogger

Ypnos: declarative, parallel structured grid programming

ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales

ZAME: Interactive Large-Scale Graph Visualization

Zero-copy I/O processing for low-latency GPU computing

Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training

Zippy: A Framework for Computation and Visualization on a GPU Cluster

ZNN – A Fast and Scalable Algorithm for Training 3D Convolutional Networks on Multi-Core and Many-Core Shared Memory Machines

Zorua: Enhancing Programming Ease, Portability, and Performance in GPUs by Decoupling Programming Models from Resource Management

ZUCL: A ZYNQ UltraScale+ Framework for OpenCL HLS Applications

Titles: 47
open PDFs: 45
packages: 13
