Papers on hgpu.org (.txt-file)
Fault Injection techniques for GPU Reliability Evaluation
Fault Table Computation on GPUs
Fault table generation using Graphics Processing Units
Fault Tree Analysis Speed-up with GPU Parallel Computing
FBLAS: Streaming Linear Algebra Kernels on FPGA
FBLAS: Streaming Linear Algebra on FPGA
FC_ACCEL: Enabling Efficient, Low-Latency and Flexible Inference in DNN Fully Connected Layers, using Optimized Checkerboard Block matrix decomposition, fast scheduling, and a resource efficient 1D PE array with a custom HBM2 memory subsystem
FCBench: Cross-Domain Benchmarking of Lossless Compression for Floating-Point Data
FCUDA: Enabling Efficient Compilation of CUDA Kernels onto FPGAs
FDTD calculations using graphical processing units
FDTD on Distributed Heterogeneous Multi-GPU Systems
Feasibility Analysis of Bilateral Filtering by General Purpose Graphical Processing Unit Computing
Feasibility Analysis of Low Cost Graphical Processing Units for Electromagnetic Field Simulations by Finite Difference Time Domain Method
FEAST – Realisation of hardware-oriented Numerics for HPC simulations with Finite Elements
Feature Aligned Volume Manipulation for Illustration and Visualization
Feature based terrain generation using diffusion equation
Feature Extraction and Visualization from Higher-Order CFD Data
Feature Generation for Quantification of Visual Similarity
Feature tracking and matching in video using programmable graphics hardware
Feature Tracking in Time-Varying Volumetric Data through Scale Invariant Feature Transform
Feature-based speed limit sign detection using a graphics processing unit
Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes
FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10
FELARE: Fair Scheduling of Machine Learning Applications on Heterogeneous Edge Systems
Ferrofluid Simulations with the Barnes-Hut Algorithm on Graphics Processing Units
Feynman Machine: The Universal Dynamical Systems Computer
FFT and Convolution Performance in Image Filtering on GPU
FFT Implementation on a Streaming Architecture
FFT Parallel Implementation for MRI Image Reconstruction
FFT-SPA Non-Binary LDPC Decoding on GPU
FIELA: A Fast Image Encryption with Lorenz Attractor using Hybrid Computing
Field modelling acceleration on ultrasonic systems using graphic hardware
FIESTA 4: optimized Feynman integral calculations with GPU support
FIKIT: Priority-Based Real-time GPU Multi-tasking Scheduling with Kernel Identification
File I/O on Intel Xeon Phi Coprocessors: RAM disks, VirtIO, NFS and Lustre
Filtered Blending: A new, minimal Reconstruction Filter for Ghosting-Free Projective Texturing with Multiple Images
Final Project Implementing Extremely Randomized Trees in CUDA
Financial Derivatives Modeling Using GPU’s
Financial modeling on the cell broadband engine
Finding Convex Hulls Using Quickhull on the GPU
Finding faint HI structure in and around galaxies: scraping the barrel
Finding Longest Common Subsequences by GPU-Based Parallel Ant Colony Optimization
Finding Missed Code Size Optimizations in Compilers using LLMs
Finding Next Best Views for Autonomous UAV Mapping through GPU-Accelerated Particle Simulation
Finding the Force – Consistent Particle Seeding for Satellite Aerodynamics
Finding, Measuring, and Reducing Inefficiencies in Contemporary Computer Systems
Fine-Grain Acceleration of Graph Algorithms on a Heterogeneous Chip
Fine-grain Parallelism using Multi-core, Cell/BE, and GPU Systems
Fine-grain Parallelism Using Multi-core, Cell/BE, and GPU Systems: Accelerating the Phylogenetic Likelihood Function
Fine-grain Task Aggregation and Coordination on GPUs
Fine-grained Parallel ILU Preconditioners with Fill-ins for Multi-core CPUs and GPUs
Fine-Grained Parallel Incomplete LU Factorization
Fine-grained parallelization of a Vlasov-Poisson application on GPU
Fine-Grained Resource Sharing for Concurrent GPGPU Kernels
Fine-Grained Synchronizations and Dataflow Programming on GPUs
Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation
Fine-Granular Parallel EBCOT and Optimization with CUDA for Digital Cinema Image Compression
Fine-sorting One-dimensional Particle-In-Cell Algorithm with Monte-Carlo Collisions on a Graphics Processing Unit
Fine-Tuning Vectorization and Memory Traffic on Intel Xeon Phi Coprocessors: LU Decomposition of Small Matrices
Fingerprint grid enhancement on GPU
Fingerprint Local Invariant Feature Extraction on GPU with CUDA
Finite Difference Time Domain (FDTD) Simulations Using Graphics Processors
Finite Difference Time-Domain Modelling of Metamaterials: GPU Implementation of Cylindrical Cloak
Finite differences numerical method for two-dimensional superlattice Boltzmann transport equation and case comparison of CPU(C) and GPGPU(CUDA) implementations
Finite element assembly strategies on multi-and many-core architectures
Finite Element Integration on GPUs
Finite Element Integration with Quadrature on the GPU
Finite Element Matrix Generation on a GPU
Finite Element Modelling of Prostate Deformation and Needle-Tissue Interactions
Finite element numerical integration for first order approximations on multi-core architectures
Finite Element Numerical Integration on Xeon Phi coprocessor
Finite Pointset Method for 2D Dam-Break Problem with GPU-Acceleration
Finite temperature lattice QCD with GPUs
Finite-difference time-domain simulations of metamaterials
Finite-difference time-domain solver for room acoustics using graphics processing units
Finite-size scaling method for the Berezinskii-Kosterlitz-Thouless transition
FIR filtering and AES encryption with OpenCL 2.0
Fireflies: New software for interactively exploring dynamical systems using GPU computing
Fireiron: A Scheduling Language for High-Performance Linear Algebra on GPUs
Firepile: Run-time Compilation for GPUs in Scala
First Application of Lattice QCD to Pezy-SC Processor
First Evaluation of the CPU, GPGPU and MIC Architectures for Real Time Particle Tracking based on Hough Transform at the LHC
First Experiences Optimizing Smith-Waterman on Intel’s Knights Landing Processor
First experiences with the Intel MIC architecture at LRZ
First Steps Towards More Numerical Reproducibility
Fitting multi-planet transit models to photometric time-data series by evolution strategies
Fixing Performance Bugs: An Empirical Study of Open-Source GPGPU Programs
FLASH: Fast All-to-All Communication in GPU Clusters
FLASH: Randomized Algorithms Accelerated over CPU-GPU for Ultra-High Dimensional Similarity Search
Flashlight: Enabling Innovation in Tools for Machine Learning
FlexGrip: A Soft GPGPU for FPGAs
Flexible FPGA design for FDTD using OpenCL
Flexible Hardware Mapping for Finite Element Simulations on Hybrid CPU / GPU Clusters
Flexible Linear Algebra Development and Scheduling with Cholesky Factorization
Flexible N-Way MIMO Detector on GPU
Flexible neuronal network simulation framework using code generation for NVidia CUDA
Flexible OpenCL accelerated disparity estimation for video communication applications
Titles: 100
open PDFs: 92
packages: 16