Papers on hgpu.org (.txt-file)
A Study of Single and Multi-device Synchronization Methods in Nvidia GPUs

A Study of Successive Over-relaxation Method Parallelization Over Modern HPC Languages

A Study of the Parallelization of Hybrid SAT Solver using CUDA

A Study of the Potential of Locality-Aware Thread Scheduling for GPUs

A study of the speed and the accuracy of the Boundary Element Method as applied to the computational simulation of biological organs

A Study of Time and Energy Efficient Algorithms for Parallel and Heterogeneous Computing

A Study on Efficient Application Mapping on Parallel Computing Accelerators

A Study on GPU Computing and Accelerating Simulation of Sedimentary Rock Structure

A Study on Neural-based Code Summarization in Low-resource Settings

A Study on Parallel Imaging Algorithm of 3D Geological Data
A study on tetrahedron-based inhomogeneous Monte Carlo optical simulation

A Study on the Acceleration of Arrival Curve Construction and Regular Specification Mining using GPUs

A Study on the Intersection of GPU Utilization and CNN Inference

A Summary of Recent GPU Developments and Key Enabling Technologies for Digital Media Applications
A Superresolution Framework for High-Accuracy Multiview Reconstruction

A Survey Of Architectural Approaches for Data Compression in Cache and Main Memory Systems

A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches

A Survey of Architectural Techniques For DRAM Power Management

A Survey of Architectural Techniques For Improving Cache Power Efficiency

A Survey Of Architectural Techniques for Managing Process Variation

A Survey Of Architectural Techniques for Near-Threshold Computing

A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks

A survey of BRDF models for computer graphics
A Survey of Cache Bypassing Techniques

A Survey of Cache Partitioning Techniques for Multicore Processors

A Survey of Cloud Lighting and Rendering Techniques

A Survey of Cloud-Based GPU Threats and Their Impact on AI, HPC, and Cloud Computing

A Survey of Convolutional Neural Networks on Edge with Reconfigurable Computing

A Survey of CPU-GPU Heterogeneous Computing Techniques

A Survey of CUDA-based Multidimensional Scaling on GPU Architecture

A Survey of Deep Learning Library Testing Methods

A Survey of FPGA Based Deep Learning Accelerators: Challenges and Opportunities

A Survey of FPGA Based Neural Network Accelerator

A Survey of FPGA-based Accelerators for Convolutional Neural Networks

A Survey of General-Purpose Computation on Graphics Hardware

A Survey of General-purpose Polyhedral Compilers

A survey of GPU-based medical image computing techniques

A Survey of Machine Learning for Computer Architecture and Systems

A survey of medical image registration on graphics hardware
A Survey of Medical Image Registration on Multicore and the GPU

A Survey of Methods For Analyzing and Improving GPU Energy Efficiency

A Survey of Neural Computation on Graphics Processing Hardware

A survey of point-based techniques in computer graphics

A Survey of Power Management Techniques for Phase Change Memory

A Survey of Recent Prefetching Techniques for Processor Caches

A Survey of ReRAM-based Architectures for Processing-in-memory and Neural Networks

A Survey of Soft-Error Mitigation Techniques for Non-Volatile Memories

A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems

A survey of sparse matrix-vector multiplication performance on large matrices

A Survey of System Architectures and Techniques for FPGA Virtualization

A Survey Of Techniques for Approximate Computing

A Survey Of Techniques for Architecting and Managing Asymmetric Multicore Processors

A Survey of Techniques for Architecting and Managing GPU Register File

A Survey Of Techniques for Architecting DRAM Caches

A Survey of Techniques for Architecting Processor Components using Domain Wall Memory

A Survey of Techniques for Architecting SLC/MLC/TLC Hybrid Flash Memory based SSDs

A Survey of Techniques for Architecting TLBs

A Survey Of Techniques for Cache Locking

A Survey of Techniques for Designing and Managing CPU Register File

A Survey of Techniques for Dynamic Branch Prediction

A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

A Survey of Techniques for Improving Security of GPUs

A Survey of Techniques for Improving Security of Non-volatile Memories

A Survey Of Techniques for Managing and Leveraging Caches in GPUs

A Survey of Techniques for Modeling and Improving Reliability of Computing Systems

A Survey on Agent-based Simulation using Hardware Accelerators

A Survey on Compiler Autotuning using Machine Learning

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

A Survey on Evaluating and Optimizing Performance of Intel Xeon Phi

A survey on FPGA-based accelerator for ML models

A Survey on GPU System Considering its Performance on Different Applications

A survey on graphic processing unit computing for large-scale data mining

A Survey on Hardware Accelerators for Large Language Models

A Survey on Optimization Techniques for Edge Artificial Intelligence (AI)

A Survey on Optimized Implementation of Deep Learning Models on the NVIDIA Jetson Platform

A Survey On Parallelization Of Data Mining Techniques

A survey on various computationally intensive parallel applications in High performance Computing System with OpenCL-MPI

A Survey Paper on Solving TSP using Ant Colony Optimization on GPU

A Switched Dynamical System Framework for Analysis of Massively Parallel Asynchronous Numerical Algorithms

A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code

A symbolic verifier for CUDA programs

A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves

A System for Capturing, Rendering and Multiplexing Images on Multi-view Autostereoscopic Display
A Systematic Literature Survey of Sparse Matrix-Vector Multiplication

A systematic performance study of the parallel programming framework SkePU 3 using HPC-benchmarks

A Systematic Survey of General Sparse Matrix-Matrix Multiplication

A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU

A Task-centric Memory Model for Scalable Accelerator Architectures

A task-driven implementation of a simple numerical solver for hyperbolic conservation laws

A taxonomy of accelerator architectures and their programming models
A TBB-CUDA Implementation for Background Removal in a video-based Fire Detection System

A Template Metaprogramming Approach to Support Parallel Programs for Multicores

A Tensor Compiler for Unified Machine Learning Prediction Serving

A Test Drive of the NVIDIA Jetson TX1 Developer Kit for Deep Learning and Computer Vision Applications

A tile-based parallel Viterbi algorithm for biological sequence alignment on GPU with CUDA

A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine

A time-energy performance analysis of MapReduce on heterogeneous systems with GPUs

A Tool for Automatic Suggestions for Irregular GPU Kernel Optimization

A Tool for Automatically Suggesting Source-Code Optimizations for Complex GPU Kernels

A Tool for Interactive Parallelization

Titles: 100
Open PDFs: 94
Packages: 9
