Papers on hgpu.org (.txt-file)
Performance Analysis of GPU compared to Single-core and Multi-core CPU for Natural Language Applications
Performance Analysis of GPU-Accelerated Filter-Based Source Finding for HI Spectral Line Image Data
Performance Analysis of GPU-based SAR and Interferometric SAR image processing
Performance Analysis of IBM Cell Broadband Engine on Sequence Alignment
Performance Analysis of Join Algorithms on GPUs
Performance Analysis of kNN on large datasets using CUDA & Pthreads
Performance analysis of matrix-free conjugate gradient kernels using SYCL
Performance analysis of memory transfers and GEMM subroutines on NVIDIA Tesla GPU cluster
Performance analysis of multi-core CPUs and GPU computing on SF-FDTD scheme for third order nonlinear materials and periodic media
Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes
Performance analysis of parallel gravitational N-body codes on large GPU cluster
Performance Analysis of Parallel Sorting Algorithms using GPU Computing
Performance Analysis of Roberts Edge Detection Using CUDA and OpenGL
Performance Analysis of Sobel Edge Detection Filter on GPU using CUDA & OpenGL
Performance Analysis of Sobel Edge Filter on Heterogeneous System Using OpenCL
Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs)
Performance analysis of SSE instructions in multi-core CPUs and GPU computing on FDTD scheme for solid and fluid vibration problems
Performance Analysis of the OP2 Framework on Many-core Architectures
Performance Analysis on Energy Efficient High-Performance Architectures
Performance Analysis on Several GPU Architectures of an Algorithm for Noise Removal
Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations
Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations (Part 2: Double Precision GPUs)
Performance and accuracy of Lattice-Boltzmann kernels on multi- and manycore architectures
Performance and Efficiency Analysis of Modern Accelerators: Fine-Grained Parallelism on the Intel Xeon Phi
Performance and energy footprint assessment of FPGAs and GPUs on HPC systems using Astrophysics application
Performance and energy optimization of the iterative solution of sparse linear systems on multicore processors
Performance and numerical accuracy evaluation of heterogeneous multicore systems for Krylov orthogonal basis computation
Performance and Portability of Accelerated Lattice Boltzmann Applications with OpenACC
Performance and Power Analysis of ATI GPU: A Statistical Approach
Performance and Power Comparisons Between Fermi and Cypress GPUs
Performance and Power Comparisons Between Nvidia and ATI GPUs
Performance and Power Evaluation of AI Accelerators for Training Deep Learning Models
Performance and Power Optimization of GPU Architectures for General-purpose Computing
Performance and Productivity of Parallel Python Programming: A study with a CFD Test Case
Performance and Quality of Random Number Generators
Performance and scalability of Fourier domain optical coherence tomography acceleration using graphics processing units
Performance and Scalability of GPU-Based Convolutional Neural Networks
Performance Assessment of A Multi-block Incompressible Navier-Stokes Solver using Directive-based GPU Programming in a Cluster Environment
Performance assessment of CUDA and OpenACC in large scale combustion simulations
Performance Assessment of OpenMP Compilers Targeting NVIDIA V100 GPUs
Performance Assessment of using OpenCL on FPGA Systems for ODE Solvers
Performance Aware Convolutional Neural Network Channel Pruning for Embedded GPUs
Performance benchmarking of deep learning framework on Intel Xeon Phi
Performance Characterization and Optimization of Atomic Operations on AMD GPUs
Performance characterization of data-intensive kernels on AMD Fusion architectures
Performance Characterization of Multi-threaded Graph Processing Applications on Intel Many-Integrated-Core Architecture
Performance Comparison Between Cg-based and CUDA-based Matrix Multiplications
Performance Comparison for Neuroscience Application Benchmarks
Performance comparison of CFD-DEM solver MFiX-Exa, on GPUs and CPUs
Performance Comparison of Cholesky Decomposition on GPUs and FPGAs
Performance Comparison of Different OpenCL Implementations of LBM Simulation on Commodity Computer Hardware
Performance comparison of FPGA, GPU and CPU in image processing
Performance comparison of Gauss-Jordan elimination method using OpenMP and CUDA
Performance comparison of GPU and FPGA architectures for the SVM training problem
Performance Comparison of GPU, DSP and FPGA implementations of image processing and computer vision algorithms in embedded systems
Performance Comparison of GPUs with a Genetic Algorithm based on CUDA
Performance Comparison of Graphics Processors to Reconfigurable Logic: A Case Study
Performance comparison of Lattice Boltzmann fluid flow simulation using OpenCL and CUDA frameworks
Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors
Performance Comparison with OpenMP Parallelization for Multi-core Systems
Performance Considerations When Using a Dedicated Ray Traversal Engine
Performance Counters based Power Modeling of Mobile GPUs using Deep Learning
Performance Debugging Frameworks for FPGA High-Level Synthesis
Performance Debugging of GPGPU Applications with the Divergence Map
Performance Degradation Analysis of GPU Kernels
Performance Drawbacks for Matrix Multiplication using Set Associative Cache in GPU devices
Performance Efficient DNA Sequence Detection on GPU Using Parallel Pattern Matching Approach
Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator
Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs
Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results
Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems
Performance enhancement of MAGIC FDTD-PIC plasma-wave simulations using GPU processing
Performance Evaluation and Analysis of Sparse Matrix and Graph Kernels on Heterogeneous Processors
Performance Evaluation and Improvements of the PoCL Open-Source OpenCL Implementation on Intel CPUs
Performance Evaluation and Optimization of HPCG benchmark on CPU + MIC platform
Performance evaluation and optimization of random memory access on multicores with high productivity
Performance Evaluation and Tuning of An OpenCL based Matrix Multiplier
Performance Evaluation of Advanced Features in CUDA Unified Memory
Performance Evaluation of Blocking and NonBlocking Concurrent Queues on GPUs
Performance Evaluation of Concurrent Lock-free Data Structures on GPUs
Performance Evaluation of Container-based Virtualization for High Performance Computing Environments
Performance Evaluation of CPU-GPU communication Depending on the Characteristic of Co-Located Workloads
Performance evaluation of CUDA programming for machining simulation
Performance evaluation of deep learning on smartphones
Performance Evaluation of Deep Learning Tools in Docker Containers
Performance Evaluation of Discrete Wavelet Transform Based on Image Compression Technique on Both CPU and GPU
Performance Evaluation of Edge Detection Techniques on GPU Using OpenCL
Performance Evaluation of Feature Extraction Algorithm on GPGPU
Performance evaluation of GPU memory hierarchy using the FFT
Performance evaluation of H.264/AVC decoding and visualization using the GPU
Performance Evaluation of Heterogeneous GPU Programming Frameworks for Hemodynamic Simulations
Performance evaluation of image processing algorithms on the GPU
Performance Evaluation of Intel Xeon Phi Coprocessor using XKaapi
Performance Evaluation of Mixed Precision Algorithms for Solving Sparse Linear Systems
Performance Evaluation of OpenMP’s Target Construct on GPUs – Exploring Compiler Optimizations
Performance Evaluation of OpenMP’s Target Construct on GPUs: Exploring Compiler Optimizations
Titles: 100
open PDFs: 92
packages: 10