## Papers on hgpu.org (.txt-file)

CUDA Fortran for Scientists and Engineers

CUDA Implementation in the EM Scattering of a Three-Layer Canopy

CUDA Implementation of ${\rm TE}^{z}$-FDTD Solution of Maxwell’s Equations in Dispersive Media

CUDA Implementation of a Lattice Boltzmann Method and Code Optimization

CUDA Implementation of Parallel Algorithms for Animal Noseprint Identification

CUDA implementation of the algorithm for simulating the epidemic spreading over large networks

CUDA implementation of the solution of a system of linear equations arising in an hp-Finite Element code

CUDA implementation of Wagener’s 2D convex hull PRAM algorithm

CUDA K-NN: application to the segmentation of the retinal vasculature within SD-OCT volumes of mice

CUDA Kernel Design for GPU-Based Beam Dynamics Simulations

CUDA Kernel Design for GPU-Based Beam Dynamics Simulations

CUDA Leaks: Information Leakage in GPU Architectures

CUDA Memory Optimizations for Large Data-Structures in the Gravit Simulator

CUDA method for the FDTD simulation by GPU

CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms

CUDA Parallel Algorithms for Forward and Inverse Structural Gravity Problems

CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs

CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

CUDA raytracing algorithm for visualizing discrete element model output

CUDA simulations of active dumbbell suspensions

CUDA-Accelerated Data-Mining for Putative Heteromeric Transcription Factors and Target Genes Using Microarray Gene Expression Profiles

CUDA-Accelerated Geodesic Ray-Tracing for Fiber Tracking

CUDA-Accelerated HD-ODETLAP: Lossy High Dimensional Gridded Data Compression

CUDA-accelerated Hierarchical K-means

CUDA-Accelerated ODETLAP: A Parallel Lossy Compression Implementation

CUDA-API-wrappers: Thin C++-flavored wrappers for the CUDA runtime API

CUDA-based acceleration and algorithm refinement for volume image registration

CUDA-based AES parallelization with fine-tuned GPU memory utilization

CUDA-based GPU Implementation of Hierarchical Belief Propagation for Fast Stereo Matching

CUDA-Based Jacobi’s Iterative Method

CUDA-based real time surgery simulation

CUDA-based Signed Distance Field Calculation for Adaptive Grids

CUDA-BLASTP: Accelerating BLASTP on CUDA-Enabled Graphics Hardware

CUDA-C implementation of the ADER-DG method for linear hyperbolic PDEs

CUDA-enabled LBM Flow Simulation around Three Equilateral Cylinders using GPU Computing Processor

CUDA-enabled Optimisation of Technical Analysis Parameters

CUDA-level performance with python-level productivity for Gaussian mixture model applications

CUDA-Lite: Reducing GPU programming complexity

CUDA-OpenGL Interoperability to Visualize Electromagnetic Fields Calculated by FDTD

CUDA-Zero: a framework for porting shared memory GPU applications to multi-GPUs

CUDA: Scalable parallel programming for high-performance scientific computing

cudaBayesreg: Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis

CudaChain: A Practical GPU-accelerated 2D Convex Hull Algorithm

CUDACL: A tool for CUDA and OpenCL programmers

CUDACLAW: a Data Parallel Solution Framework for Hyperbolic PDEs

CUDACS: securing the cloud with CUDA-enabled secure virtualization

CudaDMA: Optimizing GPU Memory Bandwidth via Warp Specialization

CUDAEASY – a GPU Accelerated Cosmological Lattice Program

CudaGIS: Report on the Design and Realization of a Massive Data Parallel GIS on GPUs

Cudagrind: A Valgrind Extension for CUDA

CudaHull: Fast Parallel 3D Convex Hull on the GPU

CUDAICA: GPU optimization of Infomax-ICA EEG analysis

CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences

cudaMap: a GPU accelerated program for gene expression connectivity mapping

CudaRF: A CUDA-based Implementation of Random Forests

CUDASA: Compute Unified Device and Systems Architecture

CUDASW++ 2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions

CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions

CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units

cuDNN: Efficient Primitives for Deep Learning

CUDT: A CUDA Based Decision Tree Algorithm

Cue-independent extending inverse kinematics for robust pose estimation in 3D point clouds

CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models

cufftShift: High Performance CUDA-accelerated FFT-shift Library

CUgrep: A GPU-based high performance multi-string matching system

cuGWAM: Genome-wide association multifactor dimensionality reduction using CUDA-enabled high-performance graphics processing unit

CuHMMer: A load-balanced CPU-GPU cooperative bioinformatics application

cuIBM — A GPU-accelerated Immersed Boundary Method

cuInspiral: prototype gravitational waves detection pipeline fully coded on GPU using CUDA

CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU

CULA: hybrid GPU accelerated linear algebra routines

cuLGT: Lattice Gauge Fixing on GPUs

CULLIDE: interactive collision detection between complex models in large environments using graphics hardware

CuMAPz: a tool to analyze memory access patterns in CUDA

CuMF_SGD: Fast and Scalable Matrix Factorization

CuMF: scale matrix factorization using just ONE machine with GPUs

CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures

CuNeuQuant: A CUDA Implementation of the NeuQuant Image Quantization Algorithm

CuParcone: A High-Performance Evolvable Neural Network Model

CuPP – A framework for easy CUDA integration

CURFIL: Random Forests for Image Labeling on GPU

Curling and clumping fur represented by texture layers

Curracurrong: a stream processing system for distributed environments

Current and Nascent SETI Instruments in the Radio and Optical

CUSA and CUDE: GPU-accelerated methods for estimating solvent accessible surface area and desolvation

cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs

CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform

CUSIMANN: An optimized simulated annealing software for GPUs

Customizable Domain-Specific Computing

Customizable Memory Schemes for Data Parallel Accelerators

Customization of OpenCL Applications for Efficient Task Mapping under Heterogeneous Platform Constraints

Customizing Driving Directions with GPUs

Customizing Instruction Set Extensible Reconfigurable Processors using GPUs

CUTE solutions for two-point correlation functions from large cosmological datasets

Titles: 100

open PDFs: 86

packages: 26