Papers on hgpu.org (.txt-file)
Concurrent learning of a Probabilistic Graphical Model on the GPU
Concurrent Manipulation of Dynamic Data Structures in OpenCL
Concurrent Number Cruncher: An Efficient Sparse Linear Solver on the GPU
Concurrent query processing in a GPU-based database system
Concurrent Scheduling of High-Level Parallel Programs on Multi-GPU Systems
Concurrent Solutions to Linear Systems using Hybrid CPU/GPU Nodes
Concurrent Task Execution on the Intel Xeon Phi
Conditional component composition for GPU-based systems
Cone-beam Computed tomography image reconstruction based on GPU
Confidential Computing on Heterogeneous Systems: Survey and Implications
Confidentiality Issues on a GPU in a Virtualized Environment
Configuration and Benchmarks of Peer-to-Peer Communication over Gigabit Ethernet and InfiniBand in a Cluster with Intel Xeon Phi Coprocessors
Conflux: Embedding Massively Parallel Semantics in a High-Level Programming Language
Conjugate gradient solvers on Intel Xeon Phi and NVIDIA GPUs
Connected component identification and cluster update on GPU
Connected component labeling on a 2D grid using CUDA
Connected-component identification and cluster update on graphics processing units
Connecting Architecture, Fitness, Optimizations and Performance using an Anisotropic Diffusion Filter
Connectivity-Based Segmentation for GPU-Accelerated Mesh Decompression
Considerations when evaluating microprocessor platforms
Considering GPGPU for HPC Centers: Is It Worth the Effort?
Consolidating Applications for Energy Efficiency in Heterogeneous Computing Systems
Constrained inverse volume rendering for planetary nebulae
Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition
Constructing Natural Neighbor Interpolation Based Grid DEM Using CUDA
Constructing Two-Dimensional Voronoi Diagrams via Divide-and-Conquer of Envelopes in Space
Constructing Two-Dimensional Voronoi Diagrams via Divide-and-Conquer of Envelopes in Space (thesis)
Construction and Implementation of a Simple Agent-Based System on GPU-Architectures
Construction and Rendering of Trimmed Blending Surfaces with Sharp Features on a GPU
Construction of a Virtual Cluster by Integrating PCI Pass-Through for GPU and InfiniBand Virtualization in Cloud
Construction of Efficient Kd-Trees for Static Scenes Using Voxel-visibility Heuristic
Content Based Image Retrieval with Graphical Processing Unit
Context Parallelism for Scalable Million-Token Inference
Context-aware volume navigation
Continual surface-based multi-projector blending for moving objects
Continuous Level of Detail on Graphics Hardware
Continuous Representation of Projected Attribute Spaces of Multifields over Any Spatial Sampling
Contour-based algorithm for vectorization of satellite images
Contouring for Power Systems Using Graphical Processing Units
Contract-Based General-Purpose GPU Programming
Contributions of hybrid architectures to depth imaging: a CPU, APU and GPU comparative study
Contributions to Music Semantic Analysis and Its Acceleration Techniques
Contributions to Parallel Simulation of Equation-Based Models on Graphics Processing Units
Contributions to parallel stochastic simulation: Application of good software engineering practices to the distribution of pseudorandom streams in hybrid Monte-Carlo simulations
Contributions to the Efficient Use of General Purpose Coprocessors: Kernel Density Estimation as Case Study
Convergence and Scalarization for Data-Parallel Architectures
Converting Data to Task-Parallelism by Rewrites
Converting Data-Parallelism to Task-Parallelism by Rewrites: Purely Functional Programs Across Multiple GPUs
Convex Clustering: An Attractive Alternative to Hierarchical Clustering
Convolution of large 3D images on GPU and its decomposition
Convolutional Neural Network for Sentence Classification
Convolutional Neural Network-Based Image Representation for Visual Loop Closure Detection
Convolutional Neural Networks for Human Activity Recognition using Mobile Sensors
Convolutional Neural Networks for Large-Scale Bird Song Classification in Noisy Environment
COOK Access Control on an embedded Volta GPU
Cooperative CPU, GPU, and FPGA heterogeneous execution with EngineCL
Cooperative Heterogeneous Computing for Parallel Processing on CPU/GPU Hybrids
Cooperative Kernels: GPU Multitasking for Blocking Algorithms
Cooperative Multitasking for GPU-Accelerated Grid Systems
Coordinate strip-mining and kernel fusion to lower power consumption on GPU
Coordinated system level resource management for heterogeneous many-core platforms
Copperhead: Compiling an embedded data parallel language
Coprocessor Computing with FPGA and GPU
CoreTSAR: Task Scheduling for Accelerator-aware Runtimes
Correctly rounding elementary functions on GPU
Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU
Correlating Radio Astronomy Signals with Many-Core Hardware
Correlation analysis on GPU systems using NVIDIA’s CUDA
Cortical architectures on a GPGPU
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Cosmological Calculations on the GPU
Cost Efficient PageRank Computation using GPU
Cost-aware function migration in heterogeneous systems
Cost-effective low-power graphics processing unit for handheld devices
Cost-effective medical image reconstruction: from clusters to graphics processing units
Cost-Effective Methodology for Complex Tuning Searches in HPC: Navigating Interdependencies and Dimensionality
Cost-Effective Soft-Error Protection for SRAM-Based Structures in GPGPUs
COTS cluster-based sort-last rendering: performance evaluation and pipelined implementation
Coulomb and Landau Gauge Fixing in GPUs using CUDA and MILC
Coulomb, Landau and Maximally Abelian Gauge Fixing in Lattice QCD with Multi-GPUs
Counting and Occurrence Sort for GPUs using an Embedded Language
Counting Triangles in Large Graphs on GPU
Coupled Vlasov and two-fluid codes on GPUs
Coupler Design and Optimization by GPU-Accelerated DG-FEM
Coupling a Generalized DEM and an SPH Models Under a Heterogeneous Massively Parallel Framework
Coupling between Meshless FEM Modeling and Rendering on GPU for Real-time Physically-based Volumetric Deformation
Coupling Lattice Boltzmann Gas and Level Set Method for Simulating Free Surface Flow in GPU/CUDA Environment
COVRA: A compression-domain output-sensitive volume rendering architecture based on a sparse representation of voxel blocks
COX: CUDA on X86 by Exposing Warp-Level Functions to CPUs
COX: Exposing CUDA Warp-Level Functions to CPUs
cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications
Cpp-Taskflow: A General-purpose Parallel and Heterogeneous Task Programming System at Scale
CPPJoules: An Energy Measurement Tool for C++
CPU and GPU Co-processing for Sound
CPU and GPU Implementation of QCD by using OpenCL
CPU and/or GPU: Revisiting the GPU Vs. CPU Myth
CPU-GPU Algorithms for Triangular Surface Mesh Simplification
Titles: 100
open PDFs: 94
packages: 17