Papers on hgpu.org (.txt-file)
Cytochrome P450 site of metabolism prediction from 2D topological fingerprints using GPU accelerated probabilistic classifiers
CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++
D-face: Parallel Implementation of CNN Based Face Classifier using Drone Data On K40 & Jetson TK1
D5.5.2 – Architectural Techniques to exploit SLACK & ACCURACY trade-offs
D5.5.3 – Design and implementation of the SIMD-MIMD GPU architecture
D5.5.4 – Characterization of Redundancy and Definition of Work Reuse
Daino: A High-level Framework for Parallel and Efficient AMR on GPUs
Daisen: A Framework for Visualizing Detailed GPU Execution
DAMS: distributed adaptive metaheuristic selection
Dandelion: a Compiler and Runtime for Heterogeneous Systems
Dank Learning: Generating Memes Using Deep Neural Networks
Dark Sky Simulations: Early Data Release
Darknet on OpenCL: a multi-platform tool for object detection and classification
DarKnight: An Accelerated Framework for Privacy and Integrity Preserving Deep Learning Using Trusted Hardware
Data access optimized applications on the GPU using NVIDIA CUDA
Data Acquisition with GPUs: The DAQ for the Muon g-2 Experiment at Fermilab
Data analysis and 3D evolution in High Energy Physics using graphic processor
Data Analysis of Minimally-Structured Heterogeneous Logs: An experimental study of log template extraction and anomaly detection based on Recurrent Neural Network and Naive Bayes
Data Assimilation using a GPU Accelerated Path Integral Monte Carlo Approach
Data Buffering Optimization Methods toward a Uniform Programming Interface for GPU-based Applications
Data Coherence Analysis and Optimization for Heterogeneous Computing
Data Compression using CUDA programming in GPU
Data driven scheduling approach for the multi-node multi-GPU Cholesky decomposition
Data handling inefficiencies between CUDA, 3D rendering, and system memory
Data Layout Optimization for Multi-Valued Containers in OpenCL
Data Layout Oriented Compilation Techniques in Vectorization for Multi-/Many-cores
Data Layout Transformation Exploiting Memory-Level Parallelism in Structured Grid Many-Core Applications
Data Layout Transformation for Structured-Grid Codes on GPU
Data Mining and Machine Learning in Astronomy
Data Mining Techniques in Parallel and Distributed Environment – A Comprehensive Survey
Data Mining Using Graphics Processing Units
Data Movement Optimization for High-Performance Computing
Data parallel acceleration of decision support queries using Cell/BE and GPUs
Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL
Data parallel execution challenges and runtime performance of agent simulations on GPUs
Data parallel loop statement extension to CUDA: GpuC
Data parallel patterns on CPU/GPU mix
Data Parallel Quadtree Indexing and Spatial Query Processing of Complex Polygon Data on GPUs
Data Parallel Three-Dimensional Cahn-Hilliard Field Equation Simulation on GPUs with CUDA
Data Parallel Visualization and Rendering on the RAMSES Supercomputer with ANARI
Data Parallelism Exploiting for H.264 Encoder
Data Partitioning on Heterogeneous Multicore and Multi-GPU Systems Using Functional Performance Models of Data-Parallel Applications
Data registration module – a component of semantic simulation engine
Data Regression with Normal Equation on GPU using CUDA
Data Remanence and Digital Forensic Investigation for CUDA Graphics Processing Units
Data Sorting Using Graphics Processing Units
Data Stream Classification using Random Feature Functions and Novel Method Combinations
Data structure design for GPU based heterogeneous systems
Data Structures and Algorithms for Counting Problems on Graphs using GPU
Data Structures and Transformations for Physically Based Simulation on a GPU
Data Structures for Task-based Priority Scheduling
Data Transfer Matters for GPU Computing
Data transfer optimizations for heterogeneous managed runtime systems
Data transformations enabling loop vectorization on multithreaded data parallel architectures
Data Triage and Visual Analytics for Scientific Visualization
Data Visualization and Mining using the GPU
Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory
Data-Aware Task Scheduling on Multi-accelerator Based Platforms
Data-Driven Analysis and Design of Vulkan Ray-Tracing Applications using Automatic Instrumentation
Data-Driven Dynamic Autotuning: Optimizing Autotuning Overhead with Prior Tuning Data
Data-driven Forecasting of Deep Learning Performance on GPUs
Data-driven Performance Optimization for Data-intensive Applications
Data-Driven Programming Abstractions and Optimization for Multi-Core Platforms
Data-driven versus Topology-driven Irregular Computations on GPUs
Data-efficient LLM Fine-tuning for Code Generation
Data-intensive document clustering on GPU clusters
Data-intensive document clustering on graphics processing unit (GPU) clusters
Data-Oriented Language Implementation of Lattice-Boltzmann Method for Dense and Sparse Geometries
Data-parallel Acceleration of PARSEC Black-Scholes Benchmark
Data-parallel algorithms and data structures
Data-parallel algorithms for large-scale real-time simulation of the cellular Potts model on graphics processing units
Data-Parallel Construction of delta_N-Nets with Maximum Dispersion
Data-Parallel Flattening by Expansion
Data-Parallel Hashing Techniques for GPU Architectures
Data-Parallel Octrees for Surface Reconstruction
Data-Parallelism and GPUs for Lattice Gas Fluid Simulations
Data-rich astronomy: mining synoptic sky surveys
Database Operation Development on the GPU
Dataflow-based Design and Implementation of Image Processing Applications
Dataflow-Based Implementation of Layered Sensing Applications
Dataflow-driven GPU performance projection for multi-kernel transformations
Dataloader Parameter Tuner: An Automated Dataloader Parameter Tuner for Deep Learning Models
Daubechies wavelets for high performance electronic structure calculations: The BigDFT project
Dawn of GPU Era-Potentials of Chaos Theory
DawnCC: a Source-to-Source Automatic Parallelizer of C and C++ Programs
Dax Toolkit: A Proposed Framework for Data Analysis and Visualization at Extreme Scale
DBCSR: A Library for Dense Matrix Multiplications on Distributed GPU-Accelerated Systems
DBMS Index for Hierarchical Data Using Nested Intervals and Residue Classes
DC Power Flow Based Contingency Analysis Using Graphics Processing Units
DC Power Flow Based Contingency Analysis Using Graphics Processing Units (thesis)
DCT-JPEG Image Coding Based on GPU
dCUDA: hardware supported overlap of computation and communication
De-specializing an HLS library for Deep Neural Networks: improvements upon hls4ml
Dealing With Big Data Outside Of The Cloud: GPU Accelerated Sort
Titles: 100
open PDFs: 94
packages: 16