Papers on hgpu.org (.txt-file)
FPGA based Speeded Up Robust Features

FPGA implementation of a Convolutional Neural Network for "Wake up word" detection

FPGA Implementation of Bluetooth Low Energy Physical Layer with OpenCL

FPGA Implementation of Reduced Precision Convolutional Neural Networks

FPGA in HPC: High Level Synthesis of OpenCL kernels for Molecular Dynamics

FPGA vs. GPU for sparse matrix vector multiply

FPGA vs. multi-core CPUs vs. GPUs: hands-on experience with a sorting application

FPGA-Accelerated Image Processing Using High Level Synthesis with OpenCL

FPGA-based acceleration of a particle simulation High Performance Computing application

FPGA-based acceleration of CHARMM-potential minimization

FPGA-based Acceleration of FT Convolution for Pulsar Search Using OpenCL

FPGA-Based Accelerator Design from a Domain-Specific Language

FPGA-Based Design of Numerical Algorithms for Kernel Density Estimation Using High Level Synthesis Approach

FPGA-based Tsunami Simulation: Performance Comparison with GPUs, and Roofline Model for Scalability Analysis

FPGA-GPU architecture for kernel SVM pedestrian detection

FPGA-GPU-CPU Heterogenous Architecture for Real-time Cardiac Physiological Optical Mapping

FPGA: An Efficient And Promising Platform For Real-Time Image Processing Applications

fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs

FPGAs, GPUs and the PS2 – A Single Programming Methodology

Fractal Art Generation using GPUs

Fractal Based Method on Hardware Acceleration for Natural Environments

Fractal Video Compression in OpenCL: An Evaluation of CPUs, GPUs, and FPGAs as Acceleration Platforms

Fractals Image Rendering and Compression using GPUs

Frame-based parallelization of MPEG-4 on compute unified device architecture (CUDA)

Framework for Batched and GPU-resident Factorization Algorithms Applied to Block Householder Transformations

Framework for Parallel Kernels Auto-tuning

Framework for utilizing computational devices within simulation

Frameworks for GPU Accelerators: A comprehensive evaluation using 2D/3D image registration

Frameworks for multi-core architectures: a comprehensive evaluation using 2D/3D image registration

Frameworks in Medical Image Analysis with Deep Neural Networks

Free Launch: Optimizing GPU Dynamic Kernel Launches through Thread Reuse

Free surface flow simulations on GPGPUs using the LBM

Free-form interest rate term structure decomposition: a 2nd order optimization problem

Frequent itemset mining on graphics processors

From Constraint Programming to Heterogeneous Parallelism

From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming

From English To Foreign Languages: Transferring Pre-trained Language Models

From Experiment to Design – Fault Characterization and Detection in Parallel Computer Systems Using Computational Accelerators

From GPUs to AI and quantum: three waves of acceleration in bioinformatics

From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation

From Parallel Programs to Customized Parallel Processors

From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

From Pixels to Torques: Policy Learning using Deep Dynamical Convolutional Networks

From Rendering to Tracking Point-based 3D Models

From Sparse Matrix to Optimal GPU CUDA Sparse Matrix Vector Product Implementation

From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels

FSCL: Homogeneous programming, scheduling and execution on heterogeneous platforms

FSimGP^2: An Efficient Fault Simulator with GPGPU

FSpGEMM: An OpenCL-based HPC Framework for Accelerating General Sparse Matrix-Matrix Multiplication on FPGAs

FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators

Full Covariance Gaussian Mixture Models Evaluation on GPU

Full reconstruction of a 14-qubit state within four hours

Full Speed Ahead: 3D Spatial Database Acceleration with GPUs

Full system simulation of many-core heterogeneous SoCs using GPU and QEMU semihosting

Full-Parallax Hologram Synthesis of Triangular Meshes using a Graphical Processing Unit

Full-resolution interactive CPU volume rendering with coherent BVH traversal

Full-Scale File System Acceleration on GPU

Full-Speed Deterministic Bit-Accurate Parallel Floating-Point Summation on Multi- and Many-Core Architectures

Full-stack Optimization for Accelerating CNNs with FPGA Validation

Full-System Simulation of Mobile CPU/GPU Platforms

Fully 3-D List-Mode OSEM Accelerated by Graphics Processing Units

Fully 3D list-mode time-of-flight PET image reconstruction on GPUs using CUDA

Fully accelerating quantum Monte Carlo simulations of real materials on GPU clusters

Fully automatic extraction of salient objects from videos in near real-time

Fully Concurrent GPU Data Structures

Fully GPU based real time corrections and reconstruction for cone beam micro CT

Fully Parallel Particle Learning for GPGPUs and Other Parallel Devices

Fully-3D GPU PET reconstruction

Fully-Automated Code Generation for Efficient Computation of Sparse Matrix Permanents on GPUs

Function Call Re-Vectorization

Functional and dynamic programming in the design of parallel prefix networks

Functional High Performance Financial IT

Functional Programming for High-Performance Computing on Heterogeneous Architectures

Functional Signal Processing with Pure and Faust Using the LLVM Toolkit

Fusion of Morphological Images for Airborne Target Detection

FusionAccel: A General Re-configurable Deep Learning Inference Accelerator on FPGA for Convolutional Neural Networks

FusionSim: Characterizing the Performance Benefits of Fused CPU/GPU Systems

FusionStitching: Boosting Execution Efficiency of Memory Intensive Computations for DL Workloads

FusionStitching: Deep Fusion and Code Generation for Tensorflow Computations on GPUs

Future of GPGPU Micro-Architectural Parameters

FUX-Sim: Implementation of a fast universal simulation/reconstruction framework for X-ray systems

Fuzz4cuda: Fuzzing Your Nvidia Gpu Libraries Through Debug Interface

Fuzzing Loop Optimizations in Compilers for C++ and Data-Parallel Languages

Fuzzy ART Neural Network Parallel Computing on the GPU

FuzzyGPU: a fuzzy arithmetic library for GPU

FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs

G-CP: Providing Fault Tolerance on the GPU through Software Checkpointing

G-Heart: A GPU-based System for Electrophysiological Simulation and Multi-modality Cardiac Visualization

G-NET: Effective GPU Sharing in NFV Systems

G-NetMon: A GPU-accelerated Network Performance Monitoring System

G-NetMon: A GPU-accelerated Network Performance Monitoring System for Large Scale Scientific Collaborations

G-SNPM – A GPU-based SNP mapping tool

GA3C: GPU-based A3C for Deep Reinforcement Learning

GACO: A GPU-based High Performance Parallel Multi-ant Colony Optimization Algorithm

GaDei: On Scale-up Training As A Service For Deep Learning

Titles: 100
open PDFs: 93
packages: 15
