Papers on hgpu.org (.txt-file)
Fast-Coding Robust Motion Estimation Model in a GPU
Fast-Fourier-Transform-Based Electrical Noise Measurements
Fast, Accurate and Shift-Varying Line Projections for Iterative Reconstruction Using the GPU
Fast, large volume, GPU enabled simulations for the Ly-alpha forest: power spectrum forecasts for baryon acoustic oscillation experiments
Fast, Memory-Efficient Construction of Voxelized Shadows
Fast, parallel and secure cryptography algorithm using Lorenz’s attractor
Fast, parallel implementation of particle filtering on the GPU architecture
Fast, parallel, GPU-based construction of space filling curves and octrees
Fast, Processor-Cardinality Agnostic PRNG with a Tracking Application
Fast, Realistic Terrain Synthesis
FAST: fast architecture sensitive tree search on modern CPUs and GPUs
FastCollect: Offloading Generational Garbage Collection to Integrated GPUs
Faster across the PCIe bus: A GPU library for lightweight decompression
Faster Algorithms for RNA-folding using the Four-Russians method
Faster and Cheaper: Parallelizing Large-Scale Matrix Factorization on GPUs
Faster Dark Matter Calculations Using the GPU
Faster File Matching using GPGPUs
Faster GPU Based Genetic Programming Using A Two Dimensional Stack
Faster GPU-based convolutional gridding via thread coarsening
Faster Maliciously Secure Two-Party Computation Using the GPU
Faster matrix-vector multiplication on GeForce 8800GTX
Faster Multipattern Matching System on GPU Based on Aho-Corasick Algorithm
Faster Multiple Pattern Matching System on GPU based on Bit-Parallelism
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster Radix Sort via Virtual Memory and Write-Combining
Faster sequence alignment through GPU-accelerated restriction of the seed-and-extend search space
Faster than FAST: GPU-Accelerated Frontend for High-Speed VIO
Faster Upper Body Pose Estimation and Recognition Using CUDA
Faster Upper Body Pose Estimation Using CUDA
FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours
fastHOG – a real-time GPU implementation of HOG
FastMag: Fast micromagnetic simulator for complex magnetic structures
Fastplay: A Parallelization Model and Implementation of SMC on CUDA Based GPU Cluster Architecture
Fastrack: Fast IO for Secure ML using GPU TEEs
FastSpMM: An Efficient Library for Sparse Matrix Matrix Product on GPUs
FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation
FastTree: A Hardware KD-Tree Construction Acceleration Engine for Real-Time Ray Tracing
Fat versus Thin Threading Approach on GPUs: Application to Stochastic Simulation of Chemical Reactions
Fat vs. Thin Threading Approach on GPUs: Application to Stochastic Simulation of Chemical Reactions
FATSEA-An Architectural Simulator for General Purpose Computing on GPUs
Fault Injection techniques for GPU Reliability Evaluation
Fault Table Computation on GPUs
Fault table generation using Graphics Processing Units
Fault Tree Analysis Speed-up with GPU Parallel Computing
FBLAS: Streaming Linear Algebra Kernels on FPGA
FBLAS: Streaming Linear Algebra on FPGA
FC_ACCEL: Enabling Efficient, Low-Latency and Flexible Inference in DNN Fully Connected Layers, using Optimized Checkerboard Block matrix decomposition, fast scheduling, and a resource efficient 1D PE array with a custom HBM2 memory subsystem
FCBench: Cross-Domain Benchmarking of Lossless Compression for Floating-Point Data
FCUDA: Enabling Efficient Compilation of CUDA Kernels onto FPGAs
FDTD calculations using graphical processing units
FDTD on Distributed Heterogeneous Multi-GPU Systems
Feasibility Analysis of Bilateral Filtering by General Purpose Graphical Processing Unit Computing
Feasibility Analysis of Low Cost Graphical Processing Units for Electromagnetic Field Simulations by Finite Difference Time Domain Method
FEAST – Realisation of hardware-oriented Numerics for HPC simulations with Finite Elements
Feature Aligned Volume Manipulation for Illustration and Visualization
Feature based terrain generation using diffusion equation
Feature Extraction and Visualization from Higher-Order CFD Data
Feature Generation for Quantification of Visual Similarity
Feature tracking and matching in video using programmable graphics hardware
Feature Tracking in Time-Varying Volumetric Data through Scale Invariant Feature Transform
Feature-based speed limit sign detection using a graphics processing unit
Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes
FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10
FELARE: Fair Scheduling of Machine Learning Applications on Heterogeneous Edge Systems
Ferrofluid Simulations with the Barnes-Hut Algorithm on Graphics Processing Units
Feynman Machine: The Universal Dynamical Systems Computer
FFT and Convolution Performance in Image Filtering on GPU
FFT Implementation on a Streaming Architecture
FFT Parallel Implementation for MRI Image Reconstruction
FFT-SPA Non-Binary LDPC Decoding on GPU
FIELA: A Fast Image Encryption with Lorenz Attractor using Hybrid Computing
Field modelling acceleration on ultrasonic systems using graphic hardware
FIESTA 4: optimized Feynman integral calculations with GPU support
FIKIT: Priority-Based Real-time GPU Multi-tasking Scheduling with Kernel Identification
File I/O on Intel Xeon Phi Coprocessors: RAM disks, VirtIO, NFS and Lustre
Filtered Blending: A new, minimal Reconstruction Filter for Ghosting-Free Projective Texturing with Multiple Images
Final Project Implementing Extremely Randomized Trees in CUDA
Financial Derivatives Modeling Using GPU’s
Financial modeling on the cell broadband engine
Finding Convex Hulls Using Quickhull on the GPU
Finding faint HI structure in and around galaxies: scraping the barrel
Finding Longest Common Subsequences by GPU-Based Parallel Ant Colony Optimization
Finding Next Best Views for Autonomous UAV Mapping through GPU-Accelerated Particle Simulation
Finding the Force – Consistent Particle Seeding for Satellite Aerodynamics
Finding, Measuring, and Reducing Inefficiencies in Contemporary Computer Systems
Fine-Grain Acceleration of Graph Algorithms on a Heterogeneous Chip
Fine-grain Parallelism using Multi-core, Cell/BE, and GPU Systems
Fine-grain Parallelism Using Multi-core, Cell/BE, and GPU Systems: Accelerating the Phylogenetic Likelihood Function
Fine-grain Task Aggregation and Coordination on GPUs
Fine-grained Parallel ILU Preconditioners with Fill-ins for Multi-core CPUs and GPUs
Fine-Grained Parallel Incomplete LU Factorization
Fine-grained parallelization of a Vlasov-Poisson application on GPU
Fine-Grained Resource Sharing for Concurrent GPGPU Kernels
Fine-Grained Synchronizations and Dataflow Programming on GPUs
Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation
Fine-Granular Parallel EBCOT and Optimization with CUDA for Digital Cinema Image Compression
Fine-sorting One-dimensional Particle-In-Cell Algorithm with Monte-Carlo Collisions on a Graphics Processing Unit
Fine-Tuning Vectorization and Memory Traffic on Intel Xeon Phi Coprocessors: LU Decomposition of Small Matrices
Fingerprint grid enhancement on GPU
Titles: 100
open PDFs: 93
packages: 18