Papers on hgpu.org (.txt-file)

CUDA Application Design and Development

CUDA au Coq: A Framework for Machine-validating GPU Assembly Programs

CUDA Based CAMshift Algorithm for Object Tracking Systems

CUDA Based Enhanced Differential Evolution: a Computational Analysis

CUDA Based Fast Implementation of Very Large Matrix Computation

CUDA Based GPU Programming to Simulate 3D Tissue Deformation

CUDA based iterative methods for linear systems

CUDA Based Multi Objective Parallel Genetic Algorithms: Adapting Evolutionary Algorithms for Document Searches

CUDA Based Performance Evaluation of the Computational Efficiency of the DCT Image Compression Technique on Both the CPU and GPU

CUDA by Example: An Introduction to General-Purpose GPU Programming

CUDA Compatible GPU as an Efficient Hardware Accelerator for AES Cryptography

CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment

CUDA cuts: Fast graph cuts on the GPU

CUDA Enhanced Filtering in a Pipelined Video Processing Framework

CUDA Enhanced Simulated Annealing for Chip Layout Problem

CUDA Fortran for Scientists and Engineers

CUDA Implementation in the EM Scattering of a Three-Layer Canopy

CUDA Implementation of ${\rm TE}^{z}$-FDTD Solution of Maxwell’s Equations in Dispersive Media

CUDA Implementation of a Lattice Boltzmann Method and Code Optimization

CUDA Implementation of Parallel Algorithms for Animal Noseprint Identification

CUDA implementation of the algorithm for simulating the epidemic spreading over large networks

CUDA implementation of the solution of a system of linear equations arising in an hp-Finite Element code

CUDA implementation of Wagener’s 2D convex hull PRAM algorithm

CUDA K-NN: application to the segmentation of the retinal vasculature within SD-OCT volumes of mice

CUDA Kernel Design for GPU-Based Beam Dynamics Simulations

CUDA Leaks: Information Leakage in GPU Architectures

CUDA Memory Optimizations for Large Data-Structures in the Gravit Simulator

CUDA method for the FDTD simulation by GPU

CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms

CUDA Parallel Algorithms for Forward and Inverse Structural Gravity Problems

CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs

CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

CUDA raytracing algorithm for visualizing discrete element model output

CUDA simulations of active dumbbell suspensions

CUDA Tutorial – Cryptanalysis of Classical Ciphers Using Modern GPUs and CUDA

CUDA-Accelerated Data-Mining for Putative Heteromeric Transcription Factors and Target Genes Using Microarray Gene Expression Profiles

CUDA-Accelerated Geodesic Ray-Tracing for Fiber Tracking

CUDA-Accelerated HD-ODETLAP: Lossy High Dimensional Gridded Data Compression

CUDA-accelerated Hierarchical K-means

CUDA-Accelerated ODETLAP: A Parallel Lossy Compression Implementation

CUDA-API-wrappers: Thin C++-flavored wrappers for the CUDA runtime API

CUDA-based acceleration and algorithm refinement for volume image registration

CUDA-based AES parallelization with fine-tuned GPU memory utilization

CUDA-based GPU Implementation of Hierarchical Belief Propagation for Fast Stereo Matching

CUDA-Based Jacobi’s Iterative Method

CUDA-based real time surgery simulation

CUDA-based Signed Distance Field Calculation for Adaptive Grids

CUDA-BLASTP: Accelerating BLASTP on CUDA-Enabled Graphics Hardware

CUDA-C implementation of the ADER-DG method for linear hyperbolic PDEs

CUDA-enabled LBM Flow Simulation around Three Equilateral Cylinders using GPU Computing Processor

CUDA-enabled Optimisation of Technical Analysis Parameters

cuda-kat: The CUDA Kernel Author’s Toolkit

CUDA-level performance with python-level productivity for Gaussian mixture model applications

CUDA-Lite: Reducing GPU programming complexity

CUDA-LLM: LLMs Can Write Efficient CUDA Kernels

CUDA-OpenGL Interoperability to Visualize Electromagnetic Fields Calculated by FDTD

CUDA-Zero: a framework for porting shared memory GPU applications to multi-GPUs

CUDA: Scalable parallel programming for high-performance scientific computing

cudaBayesreg: Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis

CudaChain: A Practical GPU-accelerated 2D Convex Hull Algorithm

CUDACL: A tool for CUDA and OpenCL programmers

CUDACLAW: a Data Parallel Solution Framework for Hyperbolic PDEs

CUDACS: securing the cloud with CUDA-enabled secure virtualization

CudaDMA: Optimizing GPU Memory Bandwidth via Warp Specialization

CUDAEASY – a GPU Accelerated Cosmological Lattice Program

CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization

CudaGIS: Report on the Design and Realization of a Massive Data Parallel GIS on GPUs

Cudagrind: A Valgrind Extension for CUDA

CudaHull: Fast Parallel 3D Convex Hull on the GPU

CUDAICA: GPU optimization of Infomax-ICA EEG analysis

CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences

cudaMap: a GPU accelerated program for gene expression connectivity mapping

CudaRF: A CUDA-based Implementation of Random Forests

CUDASA: Compute Unified Device and Systems Architecture

CUDASW++ 2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions

CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions

CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units

cuDNN: Efficient Primitives for Deep Learning

CUDT: A CUDA Based Decision Tree Algorithm

Cue-independent extending inverse kinematics for robust pose estimation in 3D point clouds

CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models

cufftShift: High Performance CUDA-accelerated FFT-shift Library

cuFINUFFT: a load-balanced GPU library for general-purpose nonuniform FFTs

CUgrep: A GPU-based high performance multi-string matching system

cuGWAM: Genome-wide association multifactor dimensionality reduction using CUDA-enabled high-performance graphics processing unit

CuHMMer: A load-balanced CPU-GPU cooperative bioinformatics application

cuIBM — A GPU-accelerated Immersed Boundary Method

cuInspiral: prototype gravitational waves detection pipeline fully coded on GPU using CUDA

CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU

CULA: hybrid GPU accelerated linear algebra routines

CuLDA_CGS: Solving Large-scale LDA Problems on GPUs

cuLGT: Lattice Gauge Fixing on GPUs

CULLIDE: interactive collision detection between complex models in large environments using graphics hardware

Titles: 100
open PDFs: 84
packages: 26
