Papers on hgpu.org (.txt-file)
CUDA Leaks: Information Leakage in GPU Architectures
CUDA Memory Optimizations for Large Data-Structures in the Gravit Simulator
CUDA method for the FDTD simulation by GPU
CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms
CUDA Parallel Algorithms for Forward and Inverse Structural Gravity Problems
CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs
CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models
CUDA raytracing algorithm for visualizing discrete element model output
CUDA simulations of active dumbbell suspensions
CUDA Tutorial – Cryptanalysis of Classical Ciphers Using Modern GPUs and CUDA
CUDA-Accelerated Data-Mining for Putative Heteromeric Transcription Factors and Target Genes Using Microarray Gene Expression Profiles
CUDA-Accelerated Geodesic Ray-Tracing for Fiber Tracking
CUDA-Accelerated HD-ODETLAP: Lossy High Dimensional Gridded Data Compression
CUDA-accelerated Hierarchical K-means
CUDA-Accelerated ODETLAP: A Parallel Lossy Compression Implementation
CUDA-API-wrappers: Thin C++-flavored wrappers for the CUDA runtime API
CUDA-based acceleration and algorithm refinement for volume image registration
CUDA-based AES parallelization with fine-tuned GPU memory utilization
CUDA-based GPU Implementation of Hierarchical Belief Propagation for Fast Stereo Matching
CUDA-Based Jacobi’s Iterative Method
CUDA-based real time surgery simulation
CUDA-based Signed Distance Field Calculation for Adaptive Grids
CUDA-BLASTP: Accelerating BLASTP on CUDA-Enabled Graphics Hardware
CUDA-C implementation of the ADER-DG method for linear hyperbolic PDEs
CUDA-enabled LBM Flow Simulation around Three Equilateral Cylinders using GPU Computing Processor
CUDA-enabled Optimisation of Technical Analysis Parameters
cuda-kat: The CUDA Kernel Author’s Toolkit
CUDA-level performance with python-level productivity for Gaussian mixture model applications
CUDA-Lite: Reducing GPU programming complexity
CUDA-LLM: LLMs Can Write Efficient CUDA Kernels
CUDA-OpenGL Interoperability to Visualize Electromagnetic Fields Calculated by FDTD
CUDA-Zero: a framework for porting shared memory GPU applications to multi-GPUs
CUDA: Scalable parallel programming for high-performance scientific computing
cudaBayesreg: Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis
CudaChain: A Practical GPU-accelerated 2D Convex Hull Algorithm
CUDACL: A tool for CUDA and OpenCL programmers
CUDACLAW: a Data Parallel Solution Framework for Hyperbolic PDEs
CUDACS: securing the cloud with CUDA-enabled secure virtualization
CudaDMA: Optimizing GPU Memory Bandwidth via Warp Specialization
CUDAEASY – a GPU Accelerated Cosmological Lattice Program
CudaGIS: Report on the Design and Realization of a Massive Data Parallel GIS on GPUs
Cudagrind: A Valgrind Extension for CUDA
CudaHull: Fast Parallel 3D Convex Hull on the GPU
CUDAICA: GPU optimization of Infomax-ICA EEG analysis
CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences
cudaMap: a GPU accelerated program for gene expression connectivity mapping
CudaRF: A CUDA-based Implementation of Random Forests
CUDASA: Compute Unified Device and Systems Architecture
CUDASW++ 2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions
CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions
CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units
cuDNN: Efficient Primitives for Deep Learning
CUDT: A CUDA Based Decision Tree Algorithm
Cue-independent extending inverse kinematics for robust pose estimation in 3D point clouds
CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models
cufftShift: High Performance CUDA-accelerated FFT-shift Library
cuFINUFFT: a load-balanced GPU library for general-purpose nonuniform FFTs
CUgrep: A GPU-based high performance multi-string matching system
cuGWAM: Genome-wide association multifactor dimensionality reduction using CUDA-enabled high-performance graphics processing unit
CuHMMer: A load-balanced CPU-GPU cooperative bioinformatics application
cuIBM — A GPU-accelerated Immersed Boundary Method
cuInspiral: prototype gravitational waves detection pipeline fully coded on GPU using CUDA
CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU
CULA: hybrid GPU accelerated linear algebra routines
CuLDA_CGS: Solving Large-scale LDA Problems on GPUs
cuLGT: Lattice Gauge Fixing on GPUs
CULLIDE: interactive collision detection between complex models in large environments using graphics hardware
CuMAPz: a tool to analyze memory access patterns in CUDA
CuMF_SGD: Fast and Scalable Matrix Factorization
CuMF: scale matrix factorization using just ONE machine with GPUs
CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures
CuNeuQuant: A CUDA Implementation of the NeuQuant Image Quantization Algorithm
CuParcone: A High-Performance Evolvable Neural Network Model
CuPBoP-AMD: Extending CUDA to AMD Platforms
CuPBoP: CUDA for Parallelized and Broad-range Processors
CuPBoP: Making CUDA a Portable Language
cuPC: CUDA-based Parallel PC Algorithm for Causal Structure Learning on GPU
cuPentBatch – A batched pentadiagonal solver for NVIDIA GPUs
CuPP – A framework for easy CUDA integration
cuPSO: GPU Parallelization for Particle Swarm Optimization Algorithms
CURFIL: Random Forests for Image Labeling on GPU
Curling and clumping fur represented by texture layers
Curracurrong: a stream processing system for distributed environments
Current and Nascent SETI Instruments in the Radio and Optical
CUSA and CUDE: GPU-accelerated methods for estimating solvent accessible surface area and desolvation
cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs
CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform
CUSIMANN: An optimized simulated annealing software for GPUs
cuSLINK: Single-linkage Agglomerative Clustering on the GPU
cuSten – CUDA Finite Difference and Stencil Library
Custom Code Generation for a Graph DSL
Customizable Domain-Specific Computing
Customizable Memory Schemes for Data Parallel Accelerators
Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints
Customizing Driving Directions with GPUs
Titles: 100
open PDFs: 86
packages: 34