Papers on hgpu.org (.txt-file)
The Framework and Compilation Techniques for Directive-based GPU Cluster Programming
The Future in Mobile Multicore Computing
The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?
The GASPI API specification and its implementation GPI 2.0
The Geant4 Visualisation System – a multi-driver graphics system
The GeForce 6 series GPU architecture
The Genetic Convolutional Neural Network Model Based on Random Sample
The GENGA Code: Gravitational Encounters in N-body simulations with GPU Acceleration
The GPU as a high performance computational resource
The GPU as numerical simulation engine
The GPU Computing Revolution: From Multi-Core CPUs To Many-Core Graphics Processors
The GPU Enhanced Parallel Computing for Large Scale Data Clustering
The GPU enters computing’s mainstream
The GPU on biomedical image processing for color and phenotype analysis
The GPU on irregular computing: performance issues and contributions
The GPU on the simulation of cellular computing models
The GPU vs Phi Debate: Risk Analytics Using Many-Core Computing
The GPU-based High-performance Pattern-matching Algorithm for Intrusion Detection
The GPU-based Parallel Ant Colony System
The GPU-based String Matching System in Advanced AC Algorithm
The gputools package enables GPU computing in R
The GPUVerify Method: a Tutorial Overview
The Graphics Card as a Streaming Computer
The Graphics Processor as a Mathematical Coprocessor in MATLAB
The Heisenberg spin glass model on GPU: myths and actual facts
The Hierarchical Memory Machine Model for GPUs
The Hitchhiker’s Guide to Cross-Platform OpenCL Application Development
The impact of accelerator processors for high-throughput molecular modeling and simulation
The impact of diverse memory architectures on multicore consumer software: an industrial perspective from the video games domain
The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study
The impact of GPU/Multicore in Signal Processing: a quantitative approach
The Implement of Common Beam Forming Using GPU
The implementation and optimization of Bitonic sort algorithm based on CUDA
The Implementation of a Real-Time Polyphase Filter
The implementation of Multi-Scale Retinex image enhancement algorithm based on GPU via CUDA
The Infrared behavior of SU(3) Nf=12 gauge theory -about the existence of conformal fixed point-
The integrated implementation of surgical simulations through modeling by means of imaging, comprehension, visualization, deformation, and collision detection in virtual environments
The International Exascale Software Project roadmap
The K-Anonymity Approach in Preserving the Privacy of E-Services that Implement Data Mining
The Lattice Boltzmann Equation Method for Complex Flows
The Lattice Boltzmann Simulation on Multi-GPU Systems
The lattice-Boltzmann method for simulating gaseous phenomena
The Linear Direct Sparse Solver on GPU for Bundle Adjustment Method
The Living Application: a Self-Organising System for Complex Grid Tasks
The magic volume lens: an interactive focus+context technique for volume rendering
The Memory Controller Wall: Benchmarking the Intel FPGA SDK for OpenCL Memory Interface
The method of improving performance of the GPU-accelerated 2D FDTD simulator
The Model of Computation of CUDA and its Formal Semantics
The MOPED framework: Object recognition and pose estimation for manipulation
The More We Share, The More We Have: Improving GPU performance through Register Sharing
The MOSIX Cluster Operating System for High-Performance Computing on Linux Clusters, Multi-Clusters, GPU Clusters and Clouds
The MOSIX Virtual OpenCL (VCL) Cluster Platform
The multi-GPU System with ExpEther
The Multi2Sim Simulation Framework: A CPU-GPU Model for Heterogeneous Computing
The multikernel: a new OS architecture for scalable multicore systems
The nonequispaced FFT on graphics processing units
The OoO VLIW JIT Compiler for GPU Inference
The Open MatSci ML Toolkit: A Flexible Framework for Machine Learning in Materials Science
The openip open source image processing library
The OpenMP Cluster Programming Model
The Optimization of Algorithms in the Process of Temporal Data Mining Using the Compute Unified Device Architecture
The optimization of parallel Smith-Waterman sequence alignment using on-chip memory of GPGPU
The orthorectified technology for UAV aerial remote sensing image based on the Programmable GPU
The Parallel Bayesian Toolbox for High-performance Bayesian Filtering in Metrology
The Parallel Processing Based on CUDA for Convolution Filter FDK Reconstruction of CT
The PEPPHER Approach to Programmability and Performance Portability for Heterogeneous many-core Architectures
The PEPPHER Composition Tool: Performance-Aware Dynamic Composition of Applications for GPU-based Systems
The Performance Analysis Based on Heterogeneous Parallel Processors for Anisotropic Diffusion Filters
The performances of R GPU implementations of the GMRES method
The Physics of Singular Dislocation Structures in Continuum Dislocation Dynamics
The Plasma Simulation Code: A modern particle-in-cell code with load-balancing and GPU support
The Possibility of Fast Large-Scale Numerical Simulation Implemented with Graphics Processing Units
The Potential for a GPU-Like Overlay Architecture for FPGAs
The Potential of the Intel Xeon Phi for Supervised Deep Learning
The Power-Performance Tradeoffs of the Intel Xeon Phi on HPC Applications
The Promises of Hybrid Hexagonal/Classical Tiling for GPU
The Q Continuum Simulation: Harnessing the Power of GPU Accelerated Supercomputers
The Reconstruction Toolkit (RTK), an open-source cone-beam CT reconstruction toolkit based on the Insight Toolkit (ITK)
The Reduction Problem in CUDA and Its Simulation with P Systems
The Research of Large-Scale 3D Scenes Rendering Optimization
The Research of Real-Time Shadow Rendering Algorithm of Virtual Scenes
The Rhombic Dodecahedron Map: An Efficient Scheme for Encoding Panoramic Video
The Risks of WebGL: Analysis, Evaluation and Detection
The Rodinia Benchmark Suite in SYCL
The role of GPU computing in medical image analysis and visualization
The role of multigrid algorithms for LQCD
The Saga of Landau-Gauge Propagators: Gathering New Ammo
The Scalable Heterogeneous Computing (SHOC) benchmark suite
The scoring sequences on profile Hidden Markov Models with delete states elimination by GPUs
The Security of Key Derivation Functions in WINRAR
The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches
The sparse matrix vector product on GPUs
The State of the Art in Interactive Global Illumination
The Stencil Processing Unit: GPGPU Done Right
The Study of the OpenCL Processing Models for the FPGA Devices
The system for visualization of synoptic objects
Titles: 100
open PDFs: 88
packages: 20