Views of posts on hgpu.org

A Foray into Efficient Mapping of Algorithms to Hardware Platforms on Heterogeneous Systems  1,649 views

Edge coloring in unstructured CFD codes  1,649 views

The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding  1,649 views

Targeting heterogeneous architectures via macro data flow  1,648 views

Dynamic scheduling Monte-Carlo framework for multi-accelerator heterogeneous clusters  1,648 views

Computation on programmable graphics hardware  1,648 views

A Real-Time Soft Shadow Rendering Algorithm by Occluder-Discretization  1,648 views

Nested Parallelism on GPU: Exploring Parallelization Templates for Irregular Loops and Recursive Computations  1,648 views

HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism  1,648 views

Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems  1,648 views

Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation of the HPCChallenge Benchmark Suite  1,647 views

Scaling Lattice QCD beyond 100 GPUs  1,647 views

Task superscalar: using processors as functional units  1,647 views

Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs  1,647 views

High-Throughput Transaction Executions on Graphics Processors  1,647 views

Directive-Based Data Partitioning and Pipelining and Auto-Tuning for High-Performance GPU Computing  1,647 views

Keeneland: Bringing heterogeneous GPU computing to the computational science community  1,647 views

Generation of Kernels for Calculating Electron Repulsion Integrals of High Angular Momentum Functions on GPUs – Preliminary Results  1,647 views

Real-Time Simulation and Rendering of 3D Smoke on GPU Programme  1,646 views

On the technology roadmap of Free-Viewpoint 3DTV receivers  1,646 views

CL-VIS: Visualization Platform for Understanding and Checking the OpenCL Programs  1,646 views

Evaluation and tuning of the Level 3 CUBLAS for graphics processors  1,646 views

An efficient out-of-core volume rendering method based on ray casting and GPU acceleration  1,646 views

Scalable Multi-Cache Simulation Using GPUs  1,645 views

Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures  1,645 views

Research for Chinese Spam Filtering Based on GPU  1,645 views

SYCLops: A SYCL Specific LLVM to MLIR Converter  1,644 views

StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs, GPUs and FPGAs  1,644 views

CUDA Kernel Design for GPU-Based Beam Dynamics Simulations  1,644 views

NUMA Data-Access Bandwidth Characterization and Modeling  1,644 views

Query-Driven Visualization of Time-Varying Adaptive Mesh Refinement Data  1,644 views

The Possibility of Fast Large-Scale Numerical Simulation Implemented with Graphics Processing Units  1,644 views

Innovative prospective of Antenna-Gain removing the pain of EMI engineers  1,644 views

Message passing on data-parallel architectures  1,643 views

Loop Transformation Recipes for Code Generation and Auto-Tuning  1,643 views

Aeolian Sand Movement and Interacting with Vegetation: A GPU Based Simulation and Visualization Method  1,643 views

DySel: Lightweight Dynamic Selection for Kernel-based Data-parallel Programming Model  1,643 views

GPU-based matrix-free finite element solver exploiting symmetry of elemental matrices  1,643 views

PIR: PMaC’s Idiom Recognizer  1,642 views

Two-Level Approach to Efficient Visualization of Protein Dynamics  1,642 views

Bridging the Gap between FPGAs and Multi-Processor Architectures: A Video Processing Perspective  1,642 views

Physically-based interactive schlieren flow visualization  1,641 views

HPP-Controller: An intra-node controller designed for connecting heterogeneous CPUs  1,641 views

Nonmetric Priors for Continuous Multilabel Optimization  1,641 views

Next-generation acceleration and code optimization for light transport in turbid media using GPUs  1,641 views

Considering GPGPU for HPC Centers: Is It Worth the Effort?  1,640 views

Implicit Parallel Time Integrators  1,640 views

A declarative API for particle systems  1,640 views

Maximize Performance on GPUs Using the Rake-based Optimization: A Case Study  1,640 views

Highly interactive computational steering for coupled 3D flow problems utilizing multiple GPUs  1,640 views

Dr.Jit: A Just-In-Time Compiler for Differentiable Rendering  1,640 views

Profile-guided optimization of critical medical imaging algorithms  1,640 views

GPUs: An Oasis in the Supercomputing Desert  1,639 views

OpenCL-based optimizations for acceleration of object tracking on FPGAs and GPUs  1,639 views

On-line free-viewpoint video: From single to multiple view rendering  1,639 views

The scoring sequences on profile Hidden Markov Models with delete states elimination by GPUs  1,639 views

Parallel Semi-Implicit Time Integrators  1,639 views

GPU Color Constancy  1,638 views

Towards Co-execution on Commodity Heterogeneous Systems: Optimizations for Time-Constrained Scenarios  1,638 views

BVH for efficient raytracing of dynamic metaballs on GPU  1,638 views

Memory-Scalable GPU Spatial Hierarchy Construction  1,638 views

Real-Time and Realistic Simulation for Cardiac Intervention with GPU  1,638 views

AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning  1,638 views

Real-Time Reconstruction of Sensitivity Encoded Radial Magnetic Resonance Imaging Using a Graphics Processing Unit  1,637 views

Software-Based Algorithm for Modeling and Correction of Gradient Nonlinearity Distortions in Magnetic Resonance Imaging  1,637 views

GPGPU Volume Classification using SimpleOpenCL  1,637 views

Experiences with Achieving Portability across Heterogeneous Architectures  1,637 views

Interactive Isosurfaces with Quadratic C1 Splines on Truncated Octahedral Partitions  1,637 views

The conjugate gradient solver accelerated by GPU for solving wave-propagation problems  1,637 views

Design and implementation of MPEG audio layer III decoder using graphics processing units  1,637 views

Importance-Driven Particle Techniques for Flow Visualization  1,637 views

Calculation of weight vectors for wideband beamforming using Graphics Processing Units  1,637 views

Transform Coding for Hardware-accelerated Volume Rendering  1,636 views

Hierarchical N-body simulations with auto-tuning for heterogeneous systems  1,636 views

A Data Communication Scheduler for Stream Programs on CPU-GPU Platform  1,636 views

A multi-agent architecture for scheduling of high performance services in a GPU cluster  1,635 views

GPU-Based Ray Tracing of Splats  1,635 views

Efficient On-the-fly Category Retrieval using ConvNets and GPUs  1,635 views

Unstructured grid applications on GPU: performance analysis and improvement  1,635 views

A pure vision-based approach to topological SLAM  1,635 views

Parallel Variable Distribution Algorithm for Constrained Optimization with Nonmonotone Technique  1,634 views

Acceleration of Agent-Based Pandemic Modeling on Multiple GPUs  1,634 views

Accelerating S3D: A GPGPU Case Study  1,634 views

A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning  1,634 views

Computing Optimal Cycle Mean in Parallel on CUDA  1,634 views

Fast and informative flow simulations in a building by using fast fluid dynamics model on graphics processing unit  1,634 views

Integrating Two-Way Interaction Between Fluids and Rigid Bodies in the Real-Time Particle Systems Library  1,634 views

Approximate Belief Propagation by Hierarchical Averaging of Outgoing Messages  1,634 views

Improvement Study of EEMD Decomposition Efficiency Based on CUDA Architecture  1,633 views

GPU-Based Iterative Relative Fuzzy Connectedness Image Segmentation  1,633 views

Multikernel Data Partitioning With Channel on OpenCL-Based FPGAs  1,633 views

An Analytical Approach to the Design of Parallel Block Cipher Encryption/Decryption: A CPU/GPU Case Study  1,633 views

Hybrid Parallelism for Volume Rendering on Large, Multi-and Many-core Systems  1,633 views

Scene Recognition Acceleration Using CUDA and OpenMP  1,633 views

RDMA-Based Job Migration Framework for MPI over InfiniBand  1,633 views

Indexing of Spatiotemporal Trajectories for Efficient Distance Threshold Similarity Searches on the GPU  1,633 views

Improved Programming of GPU Architectures through Automated Data Allocation and Loop Restructuring  1,632 views

Learning Two-View Stereo Matching  1,632 views

Automatic generation of software pipelines for heterogeneous parallel systems  1,632 views

De-specializing an HLS library for Deep Neural Networks: improvements upon hls4ml  1,632 views

 

Brief statistics for this page

Titles: 100

Total views: 164,020

 

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hgpu.org