Papers on hgpu.org (.txt-file)
Fluid Simulation and Generating Textures with Reaction-Diffusion Systems on Surfaces in the GPU
Fluid Simulation by the Smoothed Particle Hydrodynamics Method: A Survey
Fluid Simulation on Surfaces in the GPU
Fluid simulation with SIMPLE method using graphic processors
Fluid Simulation: Smoothed Particle Hydrodynamics on the GPU
Fluid-solid coupling on a cluster of GPU graphics cards for seismic wave propagation
FluidFFT: common API (C++ and Python) for Fast Fourier Transform HPC libraries
FluoroSim: A Visual Problem-Solving Environment for Fluorescence Microscopy
Flux tubes at Finite Temperature
FMM-based vortex method for simulation of isotropic turbulence on GPUs, compared with a spectral method
fMRI analysis on the GPU – possibilities and challenges
Focus measurement on programmable graphics hardware for all in-focus rendering from light fields
Focused Volumetric Visual Hull with Color Extraction
Forecasting high frequency financial time series using parallel FFN with CUDA and ZeroMQ
Forecasting time series with constraints
Forensics on GPU Coprocessing in Databases – Research Challenges, First Experiments, and Countermeasures
Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding
Formal Description and Optimization Based High-Performance Computing on CUDA
Formal Semantics of Heterogeneous CUDA-C: A Modular Approach with Applications
Formal specification and verification of OpenCL Kernel optimization
Formalizing Address Spaces with application to Cuda, OpenCL, and beyond
ForOpenCL: Transformations Exploiting Array Syntax in Fortran for Accelerator Programming
Fortran High-Level Synthesis: Reducing the barriers to accelerating HPC codes on FPGAs
Fortran performance optimisation and auto-parallelisation by leveraging MLIR-based domain specific abstractions in Flang
FortranX: Harnessing Code Generation, Portability, and Heterogeneity in Fortran
Four styles of parallel and net programming
Four-dimensional Cone Beam CT Reconstruction and Enhancement using a Temporal Non-Local Means Method
Fourier Volume Rendering on the GPU Using a Split-Stream-FFT
FPGA accelerated 3D reconstruction using compressive sensing
FPGA Accelerated Simulation of Biologically Plausible Spiking Neural Networks
FPGA Acceleration of Multifunction Printer Image Processing using OpenCL
FPGA acceleration of rigid-molecule docking codes
FPGA Acceleration of Structured-Mesh-Based Explicit and Implicit Numerical Solvers using SYCL
FPGA acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods
FPGA Accelerators on Heterogeneous Systems: An Approach Using High Level Synthesis
FPGA and GPU implementation of large scale SpMV
FPGA Based Acceleration of Decimal Operations
FPGA Based High Performance and Scalable Block LU Decomposition Architecture
FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only
FPGA Based Satisfiability Checking
FPGA based Speeded Up Robust Features
FPGA implementation of a Convolutional Neural Network for "Wake up word" detection
FPGA Implementation of Bluetooth Low Energy Physical Layer with OpenCL
FPGA Implementation of Reduced Precision Convolutional Neural Networks
FPGA in HPC: High Level Synthesis of OpenCL kernels for Molecular Dynamics
FPGA vs. GPU for sparse matrix vector multiply
FPGA vs. multi-core CPUs vs. GPUs: hands-on experience with a sorting application
FPGA-Accelerated Image Processing Using High Level Synthesis with OpenCL
FPGA-based acceleration of a particle simulation High Performance Computing application
FPGA-based acceleration of CHARMM-potential minimization
FPGA-based Acceleration of FT Convolution for Pulsar Search Using OpenCL
FPGA-Based Accelerator Design from a Domain-Specific Language
FPGA-Based Design of Numerical Algorithms for Kernel Density Estimation Using High Level Synthesis Approach
FPGA-based Tsunami Simulation: Performance Comparison with GPUs, and Roofline Model for Scalability Analysis
FPGA-GPU architecture for kernel SVM pedestrian detection
FPGA-GPU-CPU Heterogenous Architecture for Real-time Cardiac Physiological Optical Mapping
FPGA: An Efficient And Promising Platform For Real-Time Image Processing Applications
fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs
FPGAs, GPUs and the PS2 – A Single Programming Methodology
Fractal Art Generation using GPUs
Fractal Based Method on Hardware Acceleration for Natural Environments
Fractal Video Compression in OpenCL: An Evaluation of CPUs, GPUs, and FPGAs as Acceleration Platforms
Fractals Image Rendering and Compression using GPUs
Frame-based parallelization of MPEG-4 on compute unified device architecture (CUDA)
Framework for Batched and GPU-resident Factorization Algorithms Applied to Block Householder Transformations
Framework for Parallel Kernels Auto-tuning
Framework for utilizing computational devices within simulation
Frameworks for GPU Accelerators: A comprehensive evaluation using 2D/3D image registration
Frameworks for multi-core architectures: a comprehensive evaluation using 2D/3D image registration
Frameworks in Medical Image Analysis with Deep Neural Networks
Free Launch: Optimizing GPU Dynamic Kernel Launches through Thread Reuse
Free surface flow simulations on GPGPUs using the LBM
Free-form interest rate term structure decomposition: a 2nd order optimization problem
Frequent itemset mining on graphics processors
From Constraint Programming to Heterogeneous Parallelism
From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming
From English To Foreign Languages: Transferring Pre-trained Language Models
From Experiment to Design – Fault Characterization and Detection in Parallel Computer Systems Using Computational Accelerators
From GPUs to AI and quantum: three waves of acceleration in bioinformatics
From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation
From Parallel Programs to Customized Parallel Processors
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
From Pixels to Torques: Policy Learning using Deep Dynamical Convolutional Networks
From Rendering to Tracking Point-based 3D Models
From Sparse Matrix to Optimal GPU CUDA Sparse Matrix Vector Product Implementation
From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels
FSCL: Homogeneous programming, scheduling and execution on heterogeneous platforms
FSimGP^2: An Efficient Fault Simulator with GPGPU
FSpGEMM: An OpenCL-based HPC Framework for Accelerating General Sparse Matrix-Matrix Multiplication on FPGAs
FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators
Full Covariance Gaussian Mixture Models Evaluation on GPU
Full reconstruction of a 14-qubit state within four hours
Full Speed Ahead: 3D Spatial Database Acceleration with GPUs
Full system simulation of many-core heterogeneous SoCs using GPU and QEMU semihosting
Full-Parallax Hologram Synthesis of Triangular Meshes using a Graphical Processing Unit
Full-resolution interactive CPU volume rendering with coherent BVH traversal
Full-Scale File System Acceleration on GPU
Titles: 100
open PDFs: 93
packages: 15