Papers on hgpu.org
CuNeuQuant: A CUDA Implementation of the NeuQuant Image Quantization Algorithm
CuParcone: A High-Performance Evolvable Neural Network Model
CuPBoP-AMD: Extending CUDA to AMD Platforms
CuPBoP: CUDA for Parallelized and Broad-range Processors
cuPC: CUDA-based Parallel PC Algorithm for Causal Structure Learning on GPU
cuPentBatch – A batched pentadiagonal solver for NVIDIA GPUs
CuPP – A framework for easy CUDA integration
cuPSO: GPU Parallelization for Particle Swarm Optimization Algorithms
CURFIL: Random Forests for Image Labeling on GPU
Curling and clumping fur represented by texture layers
Curracurrong: a stream processing system for distributed environments
Current and Nascent SETI Instruments in the Radio and Optical
CUSA and CUDE: GPU-accelerated methods for estimating solvent accessible surface area and desolvation
cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs
CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform
CUSIMANN: An optimized simulated annealing software for GPUs
cuSLINK: Single-linkage Agglomerative Clustering on the GPU
cuSten – CUDA Finite Difference and Stencil Library
Custom Code Generation for a Graph DSL
Customizable Domain-Specific Computing
Customizable Memory Schemes for Data Parallel Accelerators
Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints
Customizing Driving Directions with GPUs
Customizing Instruction Set Extensible Reconfigurable Processors using GPUs
cuSZ-I: High-Fidelity Error-Bounded Lossy Compression for Scientific Data on GPUs
cuSZ(x): Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs
CUTE solutions for two-point correlation functions from large cosmological datasets
cuTT: A High-Performance Tensor Transpose Library for CUDA Compatible GPUs
CUVLE: Variable-Length Encoding on CUDA
cuZK: Accelerating Zero-Knowledge Proof with A Faster Parallel Multi-Scalar Multiplication Algorithm on GPUs
CVC: The Contourlet Video Compression algorithm for real-time applications
CVPI: A Computer Vision Library For Mobile and Embedded Platforms
Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed-Precision Multigrid
Cytochrome P450 site of metabolism prediction from 2D topological fingerprints using GPU accelerated probabilistic classifiers
CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++
D-face: Parallel Implementation of CNN Based Face Classifier using Drone Data On K40 & Jetson TK1
D5.5.2 – Architectural Techniques to exploit SLACK & ACCURACY trade-offs
D5.5.3 – Design and implementation of the SIMD-MIMD GPU architecture
D5.5.4 – Characterization of Redundancy and Definition of Work Reuse
Daino: A High-level Framework for Parallel and Efficient AMR on GPUs
Daisen: A Framework for Visualizing Detailed GPU Execution
DAMS: distributed adaptive metaheuristic selection
Dandelion: a Compiler and Runtime for Heterogeneous Systems
Dank Learning: Generating Memes Using Deep Neural Networks
Dark Sky Simulations: Early Data Release
Darknet on OpenCL: a multi-platform tool for object detection and classification
DarKnight: An Accelerated Framework for Privacy and Integrity Preserving Deep Learning Using Trusted Hardware
Data access optimized applications on the GPU using NVIDIA CUDA
Data Acquisition with GPUs: The DAQ for the Muon g-2 Experiment at Fermilab
Data analysis and 3D evolution in High Energy Physics using graphic processor
Data Analysis of Minimally-Structured Heterogeneous Logs: An experimental study of log template extraction and anomaly detection based on Recurrent Neural Network and Naive Bayes
Data Assimilation using a GPU Accelerated Path Integral Monte Carlo Approach
Data Buffering Optimization Methods toward a Uniform Programming Interface for GPU-based Applications
Data Coherence Analysis and Optimization for Heterogeneous Computing
Data Compression using CUDA programming in GPU
Data driven scheduling approach for the multi-node multi-GPU Cholesky decomposition
Data handling inefficiencies between CUDA, 3D rendering, and system memory
Data Layout Optimization for Multi-Valued Containers in OpenCL
Data Layout Oriented Compilation Techniques in Vectorization for Multi-/Many-cores
Data Layout Transformation Exploiting Memory-Level Parallelism in Structured Grid Many-Core Applications
Data Layout Transformation for Structured-Grid Codes on GPU
Data Mining and Machine Learning in Astronomy
Data Mining Techniques in Parallel and Distributed Environment – A Comprehensive Survey
Data Mining Using Graphics Processing Units
Data Movement Optimization for High-Performance Computing
Data parallel acceleration of decision support queries using Cell/BE and GPUs
Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL
Data parallel execution challenges and runtime performance of agent simulations on GPUs
Data parallel loop statement extension to CUDA: GpuC
Data parallel patterns on CPU/GPU mix
Data Parallel Quadtree Indexing and Spatial Query Processing of Complex Polygon Data on GPUs
Data Parallel Three-Dimensional Cahn-Hilliard Field Equation Simulation on GPUs with CUDA
Data Parallelism Exploiting for H.264 Encoder
Data Partitioning on Heterogeneous Multicore and Multi-GPU Systems Using Functional Performance Models of Data-Parallel Applications
Data registration module – a component of semantic simulation engine
Data Regression with Normal Equation on GPU using CUDA
Data Remanence and Digital Forensic Investigation for CUDA Graphics Processing Units
Data Sorting Using Graphics Processing Units
Data Stream Classification using Random Feature Functions and Novel Method Combinations
Data structure design for GPU based heterogeneous systems
Data Structures and Algorithms for Counting Problems on Graphs using GPU
Data Structures and Transformations for Physically Based Simulation on a GPU
Data Structures for Task-based Priority Scheduling
Data Transfer Matters for GPU Computing
Data transfer optimizations for heterogeneous managed runtime systems
Data transformations enabling loop vectorization on multithreaded data parallel architectures
Data Triage and Visual Analytics for Scientific Visualization
Data Visualization and Mining using the GPU
Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory
Data-Aware Task Scheduling on Multi-accelerator Based Platforms
Data-Driven Analysis and Design of Vulkan Ray-Tracing Applications using Automatic Instrumentation
Data-Driven Programming Abstractions and Optimization for Multi-Core Platforms
Data-driven versus Topology-driven Irregular Computations on GPUs
Data-intensive document clustering on GPU clusters
Data-intensive document clustering on graphics processing unit (GPU) clusters
Data-Oriented Language Implementation of Lattice-Boltzmann Method for Dense and Sparse Geometries
Data-parallel Acceleration of PARSEC Black-Scholes Benchmark
Titles: 100
open PDFs: 94
packages: 25