Papers on hgpu.org (.txt-file)
Consolidating Applications for Energy Efficiency in Heterogeneous Computing Systems
Constrained inverse volume rendering for planetary nebulae
Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition
Constructing Natural Neighbor Interpolation Based Grid DEM Using CUDA
Constructing Two-Dimensional Voronoi Diagrams via Divide-and-Conquer of Envelopes in Space
Constructing Two-Dimensional Voronoi Diagrams via Divide-and-Conquer of Envelopes in Space (thesis)
Construction and Implementation of a Simple Agent-Based System on GPU-Architectures
Construction and Rendering of Trimmed Blending Surfaces with Sharp Features on a GPU
Construction of a Virtual Cluster by Integrating PCI Pass-Through for GPU and InfiniBand Virtualization in Cloud
Construction of Efficient Kd-Trees for Static Scenes Using Voxel-visibility Heuristic
Content Based Image Retrieval with Graphical Processing Unit
Context Parallelism for Scalable Million-Token Inference
Context-aware volume navigation
Continual surface-based multi-projector blending for moving objects
Continuous Level of Detail on Graphics Hardware
Continuous Representation of Projected Attribute Spaces of Multifields over Any Spatial Sampling
Contour-based algorithm for vectorization of satellite images
Contouring for Power Systems Using Graphical Processing Units
Contract-Based General-Purpose GPU Programming
Contributions of hybrid architectures to depth imaging: a CPU, APU and GPU comparative study
Contributions to Music Semantic Analysis and Its Acceleration Techniques
Contributions to Parallel Simulation of Equation-Based Models on Graphics Processing Units
Contributions to parallel stochastic simulation: Application of good software engineering practices to the distribution of pseudorandom streams in hybrid Monte-Carlo simulations
Contributions to the Efficient Use of General Purpose Coprocessors: Kernel Density Estimation as Case Study
Convergence and Scalarization for Data-Parallel Architectures
Converting Data to Task-Parallelism by Rewrites
Converting Data-Parallelism to Task-Parallelism by Rewrites: Purely Functional Programs Across Multiple GPUs
Convex Clustering: An Attractive Alternative to Hierarchical Clustering
Convolution of large 3D images on GPU and its decomposition
Convolutional Neural Network for Sentence Classification
Convolutional Neural Network-Based Image Representation for Visual Loop Closure Detection
Convolutional Neural Networks for Human Activity Recognition using Mobile Sensors
Convolutional Neural Networks for Large-Scale Bird Song Classification in Noisy Environment
COOK Access Control on an embedded Volta GPU
Cooperative CPU, GPU, and FPGA heterogeneous execution with EngineCL
Cooperative Heterogeneous Computing for Parallel Processing on CPU/GPU Hybrids
Cooperative Kernels: GPU Multitasking for Blocking Algorithms
Cooperative Multitasking for GPU-Accelerated Grid Systems
Coordinate strip-mining and kernel fusion to lower power consumption on GPU
Coordinated system level resource management for heterogeneous many-core platforms
Copperhead: Compiling an embedded data parallel language
Coprocessor Computing with FPGA and GPU
CoreTSAR: Task Scheduling for Accelerator-aware Runtimes
Correctly rounding elementary functions on GPU
Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU
Correlating Radio Astronomy Signals with Many-Core Hardware
Correlation analysis on GPU systems using NVIDIA’s CUDA
Cortical architectures on a GPGPU
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Cosmological Calculations on the GPU
Cost Efficient PageRank Computation using GPU
Cost-aware function migration in heterogeneous systems
Cost-effective low-power graphics processing unit for handheld devices
Cost-effective medical image reconstruction: from clusters to graphics processing units
Cost-Effective Methodology for Complex Tuning Searches in HPC: Navigating Interdependencies and Dimensionality
Cost-Effective Soft-Error Protection for SRAM-Based Structures in GPGPUs
COTS cluster-based sort-last rendering: performance evaluation and pipelined implementation
Coulomb and Landau Gauge Fixing in GPUs using CUDA and MILC
Coulomb, Landau and Maximally Abelian Gauge Fixing in Lattice QCD with Multi-GPUs
Counting and Occurrence Sort for GPUs using an Embedded Language
Counting Triangles in Large Graphs on GPU
Coupled Vlasov and two-fluid codes on GPUs
Coupler Design and Optimization by GPU-Accelerated DG-FEM
Coupling a Generalized DEM and an SPH Models Under a Heterogeneous Massively Parallel Framework
Coupling between Meshless FEM Modeling and Rendering on GPU for Real-time Physically-based Volumetric Deformation
Coupling Lattice Boltzmann Gas and Level Set Method for Simulating Free Surface Flow in GPU/CUDA Environment
COVRA: A compression-domain output-sensitive volume rendering architecture based on a sparse representation of voxel blocks
COX: CUDA on X86 by Exposing Warp-Level Functions to CPUs
COX: Exposing CUDA Warp-Level Functions to CPUs
cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications
Cpp-Taskflow: A General-purpose Parallel and Heterogeneous Task Programming System at Scale
CPU and GPU Co-processing for Sound
CPU and GPU Implementation of QCD by using OpenCL
CPU and/or GPU: Revisiting the GPU Vs. CPU Myth
CPU-GPU Algorithms for Triangular Surface Mesh Simplification
CPU-GPU Collaboration for Output Quality Monitoring
CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction application
CPU-GPU Hybrid Parallel Binomial American Option Pricing
CPU-GPU Layer-Switched Low Latency CNN Inference
CPU, GPU and FPGA Implementations of MALD: Ceramic Tile Surface Defects Detection Algorithm
CPU, SMP and GPU implementations of Nohalo level 1, a fast co-convex antialiasing image resampler
CPU/GPGPU/HW comparison of an Eigenfaces face recognition system
CPU/GPU Code Acceleration on Heterogeneous Systems and Code Verification for CFD Applications
CPU/GPU computing for long-wave radiation physics on large GPU clusters
CPUless PCs inside networked control systems
CRAC: Checkpoint-Restart Architecture for CUDA with Streams and UVM
Crack-free rendering of dynamically tesselated B-Rep models
Cracks in the Sky: Abelian-Higgs Cosmic String Evolution with CUDA
Cramming: Training a Language Model on a Single GPU in One Day
Crane – Fast and Migratable GPU Passthrough for OpenCL applications
Creating a Dataset for High-Performance Computing Code Translation using LLMs: A Bridge Between OpenMP Fortran and C++
Creating a Dataset Supporting Translation Between OpenMP Fortran and C++ Code
Creating HW/SW co-designed MPSoPC’s from high level programming models
Creating Optimal Code for GPU-Accelerated CT Reconstruction Using Ant Colony Optimization
Creation and control of rain in virtual environments
CRINK: Automatic CUDA code generation for affine C programs
Critical Comparison of the Classification Ability of Deep Convolutional Neural Network Frameworks with Support Vector Machine Techniques in the Image Classification Process
Titles: 100
open PDFs: 93
packages: 18