Papers on hgpu.org (.txt-file)
CUDA Tutorial – Cryptanalysis of Classical Ciphers Using Modern GPUs and CUDA
CUDA-Accelerated Data-Mining for Putative Heteromeric Transcription Factors and Target Genes Using Microarray Gene Expression Profiles
CUDA-Accelerated Geodesic Ray-Tracing for Fiber Tracking
CUDA-Accelerated HD-ODETLAP: Lossy High Dimensional Gridded Data Compression
CUDA-accelerated Hierarchical K-means
CUDA-Accelerated ODETLAP: A Parallel Lossy Compression Implementation
CUDA-API-wrappers: Thin C++-flavored wrappers for the CUDA runtime API
CUDA-based acceleration and algorithm refinement for volume image registration
CUDA-based AES parallelization with fine-tuned GPU memory utilization
CUDA-based GPU Implementation of Hierarchical Belief Propagation for Fast Stereo Matching
CUDA-Based Jacobi’s Iterative Method
CUDA-based real time surgery simulation
CUDA-based Signed Distance Field Calculation for Adaptive Grids
CUDA-BLASTP: Accelerating BLASTP on CUDA-Enabled Graphics Hardware
CUDA-C implementation of the ADER-DG method for linear hyperbolic PDEs
CUDA-enabled LBM Flow Simulation around Three Equilateral Cylinders using GPU Computing Processor
CUDA-enabled Optimisation of Technical Analysis Parameters
cuda-kat: The CUDA Kernel Author’s Toolkit
CUDA-level performance with python-level productivity for Gaussian mixture model applications
CUDA-Lite: Reducing GPU programming complexity
CUDA-OpenGL Interoperability to Visualize Electromagnetic Fields Calculated by FDTD
CUDA-Zero: a framework for porting shared memory GPU applications to multi-GPUs
CUDA: Scalable parallel programming for high-performance scientific computing
cudaBayesreg: Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis
CudaChain: A Practical GPU-accelerated 2D Convex Hull Algorithm
CUDACL: A tool for CUDA and OpenCL programmers
CUDACLAW: a Data Parallel Solution Framework for Hyperbolic PDEs
CUDACS: securing the cloud with CUDA-enabled secure virtualization
CudaDMA: Optimizing GPU Memory Bandwidth via Warp Specialization
CUDAEASY – a GPU Accelerated Cosmological Lattice Program
CudaGIS: Report on the Design and Realization of a Massive Data Parallel GIS on GPUs
Cudagrind: A Valgrind Extension for CUDA
CudaHull: Fast Parallel 3D Convex Hull on the GPU
CUDAICA: GPU optimization of Infomax-ICA EEG analysis
CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences
cudaMap: a GPU accelerated program for gene expression connectivity mapping
CudaRF: A CUDA-based Implementation of Random Forests
CUDASA: Compute Unified Device and Systems Architecture
CUDASW++ 2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions
CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions
CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units
cuDNN: Efficient Primitives for Deep Learning
CUDT: A CUDA Based Decision Tree Algorithm
Cue-independent extending inverse kinematics for robust pose estimation in 3D point clouds
CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models
cufftShift: High Performance CUDA-accelerated FFT-shift Library
cuFINUFFT: a load-balanced GPU library for general-purpose nonuniform FFTs
CUgrep: A GPU-based high performance multi-string matching system
cuGWAM: Genome-wide association multifactor dimensionality reduction using CUDA-enabled high-performance graphics processing unit
CuHMMer: A load-balanced CPU-GPU cooperative bioinformatics application
cuIBM — A GPU-accelerated Immersed Boundary Method
cuInspiral: prototype gravitational waves detection pipeline fully coded on GPU using CUDA
CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU
CULA: hybrid GPU accelerated linear algebra routines
CuLDA_CGS: Solving Large-scale LDA Problems on GPUs
cuLGT: Lattice Gauge Fixing on GPUs
CULLIDE: interactive collision detection between complex models in large environments using graphics hardware
CuMAPz: a tool to analyze memory access patterns in CUDA
CuMF_SGD: Fast and Scalable Matrix Factorization
CuMF: scale matrix factorization using just ONE machine with GPUs
CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures
CuNeuQuant: A CUDA Implementation of the NeuQuant Image Quantization Algorithm
CuParcone: A High-Performance Evolvable Neural Network Model
CuPBoP-AMD: Extending CUDA to AMD Platforms
CuPBoP: CUDA for Parallelized and Broad-range Processors
CuPBoP: Making CUDA a Portable Language
cuPC: CUDA-based Parallel PC Algorithm for Causal Structure Learning on GPU
cuPentBatch – A batched pentadiagonal solver for NVIDIA GPUs
CuPP – A framework for easy CUDA integration
cuPSO: GPU Parallelization for Particle Swarm Optimization Algorithms
CURFIL: Random Forests for Image Labeling on GPU
Curling and clumping fur represented by texture layers
Curracurrong: a stream processing system for distributed environments
Current and Nascent SETI Instruments in the Radio and Optical
CUSA and CUDE: GPU-accelerated methods for estimating solvent accessible surface area and desolvation
cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs
CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform
CUSIMANN: An optimized simulated annealing software for GPUs
cuSLINK: Single-linkage Agglomerative Clustering on the GPU
cuSten – CUDA Finite Difference and Stencil Library
Custom Code Generation for a Graph DSL
Customizable Domain-Specific Computing
Customizable Memory Schemes for Data Parallel Accelerators
Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints
Customizing Driving Directions with GPUs
Customizing Instruction Set Extensible Reconfigurable Processors using GPUs
cuSZ-I: High-Fidelity Error-Bounded Lossy Compression for Scientific Data on GPUs
cuSZ(x): Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs
cuSZp2: A GPU Lossy Compressor with Extreme Throughput and Optimized Compression Ratio
CUTE solutions for two-point correlation functions from large cosmological datasets
cuTT: A High-Performance Tensor Transpose Library for CUDA Compatible GPUs
CUVLE: Variable-Length Encoding on CUDA
cuZK: Accelerating Zero-Knowledge Proof with A Faster Parallel Multi-Scalar Multiplication Algorithm on GPUs
CVC: The Contourlet Video Compression algorithm for real-time applications
CVPI: A Computer Vision Library For Mobile and Embedded Platforms
Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed-Precision Multigrid
Titles: 100
open PDFs: 87
packages: 38