
Views of posts on hgpu.org

Usable assembly language for GPUs: a success story  2,418 views

Automatic Data Layout Generation and Kernel Mapping for CPU+GPU Architectures  2,418 views

Semi-Global Filtering of Airborne LiDAR Data for Fast Extraction of Digital Terrain Models  2,418 views

Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU  2,418 views

Parallel Neutrino Triggers using GPUs for an underwater telescope  2,418 views

Optimization Techniques for Mapping Algorithms and Applications onto CUDA GPU Platforms and CPU-GPU Heterogeneous Platforms  2,417 views

An extended GPU radiosity solver  2,417 views

DeepSmith: Compiler Fuzzing through Deep Learning  2,417 views

Removing the Barrier for FPGA-Based OpenCL Data Center Servers  2,417 views

Faster Algorithms for RNA-folding using the Four-Russians method  2,417 views

CULA: hybrid GPU accelerated linear algebra routines  2,417 views

Developing a CUDA solver for large sparse matrices for MARIN  2,417 views

A comprehensive study of Dynamic Memory Management in OpenCL kernels  2,417 views

Efficient visual hull computation for real-time 3D reconstruction using CUDA  2,417 views

OpenCL Implementation of a Parallel Universal Kriging Algorithm for Massive Spatial Data Interpolation on Heterogeneous Systems  2,417 views

Real-Time Adaptive Image Compression  2,417 views

PATUS: A Code Generation and Autotuning Framework For Parallel Iterative Stencil Computations on Modern Microarchitectures  2,416 views

GPU-accelerated discontinuous Galerkin methods on hybrid meshes  2,416 views

Performance Optimization Using Partitioned SpMV on GPUs and Multicore CPUs  2,416 views

Verification of Producer-Consumer Synchronization in GPU Programs  2,416 views

Solving the Boltzmann Equation on GPU  2,416 views

Graphic Processing Unit Simulation of Axon Growth and Guidance through Cue Diffusion on Massively Parallel Processors  2,416 views

A scalable hybrid algorithm based on domain decomposition and algebraic multigrid for solving partial differential equations on a cluster of CPU/GPUs  2,415 views

Numerical Model of Shallow Water: the Use of NVIDIA CUDA Graphics Processors  2,415 views

Implementation and performance evaluation of a GPU particle-in-cell code  2,415 views

OpenGL(R) ES 2.0 Programming Guide  2,415 views

Preconditioned conjugate gradient solver for structural problems  2,415 views

Parallel Graph Mining with GPUs  2,414 views

Interventional 4-D Motion Estimation and Reconstruction of Cardiac Vasculature without Motion Periodicity Assumption  2,414 views

Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced Memory Accesses on GPU  2,414 views

Performance Optimization of Clustering On GPU  2,414 views

Speeding up a Video Summarization Approach Using GPUs and Multicore CPUs  2,414 views

A Coarse Grain Reconfigurable Architecture for sequence alignment problems in bio-informatics  2,414 views

A parallel algorithm for the constrained shortest path problem on lattice graphs  2,414 views

fMRI analysis on the GPU-possibilities and challenges  2,414 views

Solving lattice QCD systems of equations using mixed precision solvers on GPUs  2,414 views

Fast calculation of HELAS amplitudes using graphics processing unit (GPU)  2,413 views

COVRA: A compression-domain output-sensitive volume rendering architecture based on a sparse representation of voxel blocks  2,413 views

Methods for GPU Acceleration of Big Data Applications  2,413 views

Near real-time Fast Bilateral Stereo on the GPU  2,413 views

A performance/cost evaluation for a GPU-based drug discovery application on volunteer computing  2,413 views

Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool (open-source code)  2,413 views

Task Performance with List-Mode Data  2,413 views

Solutions For Optimizing The Radix Sort Algorithmic Function Using The Compute Unified Device Architecture  2,413 views

Comprehensive Analysis of High-Performance Computing Methods for Filtered Back-Projection  2,413 views

GPU Programming – Speeding Up the 3D Surface Generator VESTA  2,412 views

FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only  2,412 views

Enabling OpenCL on a Configurable, VLIW Chip-Multiprocessor  2,412 views

Equalizer 2.0 – Convergence of a Parallel Rendering Framework  2,412 views

The Graphics Processor as a Mathematical Coprocessor in MATLAB  2,412 views

GPU Accelerated Randomized Singular Value Decomposition and Its Application in Image Compression  2,412 views

Taichi: A Language for High-Performance Computation on Spatially Sparse Data Structures  2,411 views

gEMpicker: A Highly Parallel GPU-Accelerated Particle Picking Tool for Cryo-Electron Microscopy  2,411 views

GPGPU accelerated optimization method of Interconnection Network Topology  2,411 views

Directive-Based, High-Level Programming and Optimizations for High-Performance Computing with FPGAs  2,411 views

An Automatic Input-Sensitive Approach for Heterogeneous Task Partitioning  2,411 views

SIMD-Based Large-Scale Transient Stability Simulation on the Graphics Processing Unit  2,411 views

Adaptive Multi-GPU Exchange Monte Carlo for the 3D Random Field Ising Model  2,410 views

Performance and Quality of Random Number Generators  2,410 views

Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs  2,410 views

Physically Based Rendering: Implementation of Path Tracer  2,410 views

GPU Gems: Programming Techniques, Tips and Tricks for Real-Time Graphics  2,410 views

Fast Multipole Method vs. Spectral Method for the Simulation of Isotropic Turbulence on GPUs  2,410 views

Optimizing Web Virtual Reality  2,410 views

A GPU-based Parallel Fireworks Algorithm for Optimization  2,410 views

OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures  2,410 views

LoGV: Low-overhead GPGPU Virtualization  2,410 views

Parallel GPU Implementation of Hough Transform for Circles  2,409 views

Evaluating the Performance and Portability of OpenCL  2,409 views

Raising the Performance of the Tinker-HP Molecular Modeling Package on Intel’s HPC Architectures: a Living Review [Article v1.0]  2,409 views

On the Portability of CPU-Accelerated Applications via Automated Source-to-Source Translation  2,409 views

Cryptanalysis of the Full AES Using GPU-Like Special-Purpose Hardware  2,409 views

Towards a Distributed GPU-Accelerated Matrix Inversion  2,408 views

Applying the Parallel GPU Model to Radiation Therapy Treatment  2,408 views

A Comparative Study of Neighborhood Filters for Artifact Reduction in Iterative Low-Dose CT  2,408 views

Implementation of a multigrid solver on GPU for Stokes equations with strongly variable viscosity based on Matlab and CUDA  2,408 views

Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging  2,408 views

GPU-to-GPU and Host-to-Host multipattern string matching on a GPU  2,408 views

Optimized Composition: Generating Efficient Code for Heterogeneous Systems from Multi-Variant Components, Skeletons and Containers  2,408 views

OmpSs task offload  2,407 views

Optimizing Communication for Clusters of GPUs  2,407 views

dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures  2,407 views

SWAPHI: Smith-Waterman Protein Database Search on Xeon Phi Coprocessors  2,407 views

Accelerating CNN inference on FPGAs: A Survey  2,407 views

Computer Vision Accelerators for Mobile Systems based on OpenCL GPGPU Co-Processing  2,407 views

LBM based flow simulation using GPU computing processor  2,407 views

Performance Comparison of GPUs with a Genetic Algorithm based on CUDA  2,407 views

Adaptive Dynamic Load Balancing in Heterogeneous Multiple GPUs-CPUs Distributed Setting: Case Study of B&B Tree Search  2,407 views

Parallel Programming using OpenCL on Modern Architectures  2,406 views

Improving the Programmability of GPU Architectures  2,406 views

Displacement Mapping on the GPU – State of the Art  2,405 views

A parallel search tree algorithm for vertex cover on graphical processing units  2,405 views

Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline  2,405 views

Learning to Detect Roads in High-Resolution Aerial Images  2,405 views

Performance and Efficiency Analysis of Modern Accelerators: Fine-Grained Parallelism on the Intel Xeon Phi  2,405 views

The future of microprocessors  2,405 views

Accelerated cryo-EM structure determination with parallelisation using GPUs in relion-2  2,404 views

Fine-Granular Parallel EBCOT and Optimization with CUDA for Digital Cinema Image Compression  2,404 views

Liszt: A Domain Specific Language for Building Portable Mesh-based PDE Solvers  2,404 views

Implementing QR Factorization Updating Algorithms on GPUs  2,404 views

 

Brief statistics for this page

Titles: 100

Total views: 241151

 


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org