Views of posts on hgpu.org
Constraint Fluids on GPU 1,347 views
Programming hybrid systems with implicit memory based synchronization 1,347 views
Challenges and opportunities of obtaining performance from multi-core CPUs and many-core GPUs 1,347 views
High-Level Support for Pipeline Parallelism on Many-Core Architectures 1,347 views
Programming Massively Parallel Architectures using MARTE: a Case Study 1,347 views
Scientific Programming for Heterogeneous Systems – Bridging the Gap between Algorithms and Applications 1,347 views
Photon mapping on programmable graphics hardware 1,347 views
Parallel Distributed Breadth First Search on the Kepler Architecture 1,347 views
GPU Based Massive Parallel Kawasaki Kinetics In Monte Carlo Modelling of Lipid Microdomains 1,346 views
Sequence Data Indexing Method Exploiting the Parallel Processing Resources of GPGPU 1,346 views
A GPGPU solution of the FMM near interactions for acoustic scattering problems 1,346 views
A GPU framework for parallel segmentation of volumetric images using discrete deformable models 1,346 views
Scaling Soft Matter Physics to Thousands of GPUs in Parallel 1,346 views
Many-core parallel computing – Can compilers and tools do the heavy lifting? 1,346 views
SOL: Effortless Device Support for AI Frameworks without Source Code Changes 1,346 views
Beam Dynamics Simulations with a GPU-accelerated Version of ELEGANT 1,346 views
An Embedded Stream Processor Core Based on Logarithmic Arithmetic for a Low-Power 3-D Graphics SoC 1,346 views
The ANTAREX Approach to Autotuning and Adaptivity for Energy Efficient HPC Systems 1,346 views
Techniques to maximize memory bandwidth on the Rigel compute accelerator 1,346 views
PNG1 triangles for tangent plane continuous surfaces on the GPU 1,346 views
Experimental B+-tree for GPU 1,346 views
86 PFLOPS Deep Potential Molecular Dynamics simulation of 100 million atoms with ab initio accuracy 1,346 views
Inline Vector Compression for Computational Physics 1,346 views
Full Speed Ahead: 3D Spatial Database Acceleration with GPUs 1,346 views
Fast Query for Exemplar-Based Image Completion 1,345 views
LDetector: A Low Overhead Race Detector For GPU Programs 1,345 views
Targeting heterogeneous architectures via macro data flow 1,345 views
Fast Disk Encryption through GPGPU Acceleration 1,345 views
Faster Upper Body Pose Estimation Using CUDA 1,345 views
Speeding up Mutual Information Computation Using NVIDIA CUDA Hardware 1,345 views
Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents 1,345 views
Large neighborhood local search optimization on graphics processing units 1,345 views
Fitting multi-planet transit models to photometric time-data series by evolution strategies 1,345 views
Solving the Flexible Job Shop Problem on Multi-GPU 1,344 views
FDTD calculations using graphical processing units 1,344 views
Implementing and Evaluating Candidate-Based Invariant Generation 1,344 views
Interlanguages and synchronic models of computation 1,344 views
Heterogeneous parallel computing for image registration and linear algebra applications 1,344 views
Multi-GPU performance optimization of a computational fluid dynamics code using OpenACC 1,344 views
Suitability of NVIDIA GPUs for SKA1-Low 1,344 views
A general tridiagonal solver for coprocessors: Adapting g-Spike for the Intel Xeon Phi 1,344 views
Out-of-core Training for Extremely Large-Scale Neural Networks With Adaptive Window-Based Scheduling 1,344 views
Random number generators for massively parallel simulations on GPU 1,344 views
Performance and accuracy of Lattice-Boltzmann kernels on multi- and manycore architectures 1,344 views
GPU-based tolerance volumes for mesh processing 1,344 views
Simulation of atmospheric binary mixtures based on two-fluid model 1,344 views
Reducing Thread Divergence in GPU-based B&B Applied to the Flow-shop problem 1,343 views
Adaptive algebraic multigrid on SIMD architectures 1,343 views
Improving processing time for visual measurements of displacements of IPMC actuators using CUDA 1,343 views
GPU-Based Fast Minimum Spanning Tree Using Data Parallel Primitives 1,343 views
RoadRunner: a fast and flexible exoplanet transit model 1,343 views
ALPyNA: Acceleration of Loops in Python for Novel Architectures 1,343 views
GPU-Based Heuristic Solver for Linear Sum Assignment Problems Under Real-time Constraints 1,343 views
Hybrid OpenCL over high speed networks 1,343 views
GPU-accelerated power pattern synthesis of aperiodic linear arrays 1,343 views
An Embedding Method for Interactive Simulation on Dynamic Surfaces 1,342 views
LUDA: Boost LSM Key Value Store Compactions with GPUs 1,342 views
Platform Characterization for Domain-Specific Computing 1,342 views
Whole-function vectorization 1,342 views
Fast continuous collision detection among deformable models using graphics processors 1,342 views
Parallelization of Single Threaded Applications using OpenMP and CUDA/C 1,342 views
Low-overhead diskless checkpoint for hybrid computing systems 1,342 views
Efficient Parallel Implementation of Molecular Dynamics with Embedded Atom Method on Multi-core Platforms 1,342 views
The Test and Evaluation Uses of Heterogeneous Computing: GPGPUs and Other Approaches 1,341 views
On the efficiency of iterative ordered subset reconstruction algorithms for acceleration on GPUs 1,341 views
Using Graphics Processors for a High Performance Normalization of Gene Expressions 1,341 views
A GPU-based architecture for improved online rebinning performance in clinical 3-D PET 1,341 views
Location-based Matching in Publish/Subscribe Revisited 1,341 views
The optimization of parallel Smith-Waterman sequence alignment using on-chip memory of GPGPU 1,341 views
On the design of architecture-aware algorithms for emerging applications 1,341 views
Characterizing Deep Learning Training Workloads on Alibaba-PAI 1,341 views
Investigating Input Representations and Representation Models of Source Code for Machine Learning 1,341 views
Compilation for Heterogeneous Computing: Automating Analyses, Transformations and Decisions 1,341 views
Hierarchical DAG Scheduling for Hybrid Distributed Systems 1,341 views
A new physics engine with automatic process distribution between CPU-GPU 1,340 views
Visibility Sampling on GPU and Applications 1,340 views
Auto-tuning of fast Fourier transform on graphics processors 1,340 views
Decoupled Vector-Fetch Architecture with a Scalarizing Compiler 1,340 views
Vector graphics depicting marbling flow 1,340 views
Top ten ways to make formal methods for HPC practical 1,340 views
Towards microsecond biological molecular dynamics simulations on hybrid processors 1,340 views
A structured parallel periodic Arnoldi shooting algorithm for RF-PSS analysis based on GPU platforms 1,340 views
Real-time rendering of large-scale tree scene 1,340 views
Real-Time Simulation of Granular Materials Using Graphics Hardware 1,339 views
Parallel preconditioning for spherical harmonics expansions of the Boltzmann transport equation 1,339 views
GPU-based asynchronous particle swarm optimization 1,339 views
GPU-based real-time small displacement estimation with ultrasound 1,339 views
Compressed Facade Displacement Maps 1,339 views
Understanding the Performance of HPC Applications 1,339 views
A Fast 3D Spatial Analysis Technique Using Graphic Process Units 1,339 views
Using GPUs to Accelerate Installed Antenna Performance Simulations 1,339 views
Raising the level of many-core programming with compiler technology: meeting a grand challenge 1,339 views
Discrete Fourier transform on multicore 1,339 views
Many-threaded implementation of differential evolution for the CUDA platform 1,338 views
Titles: 100
Total views: 134,312
- Programming - 186,129 views
- Login - 164,359 views
- User dashboard - 90,594 views
- Paper titles list - 70,004 views
- Add new event - 64,579 views
- Add new post - 59,320 views
- Register - 49,175 views
- Statistics - 36,468 views
- Modification of self-organizing migration algorithm for OpenCL framework - 34,165 views
- Books on OpenCL and CUDA - 28,811 views