
Views of posts on hgpu.org

GPU Accelerated Greedy Algorithms for Compressed Sensing  3,025 views

libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications  3,024 views

Implementing Strassen’s Algorithm with CUTLASS on NVIDIA Volta GPUs  3,022 views

Hybrid CPU-GPU Implementation of Tracking-Learning-Detection Algorithm  3,022 views

GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers  3,021 views

Accelerating the ANSYS Direct Sparse Solver with GPUs  3,019 views

A Parallel Algorithm of PCA-SIFT Based on CUDA  3,019 views

Programming Frameworks for Distributed Smartphone Computing  3,019 views

Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?  3,019 views

KERNELGEN – A Toolchain for Automatic GPU-centric Applications Porting  3,017 views

A simple GPU-based approach for 3D Voronoi diagram construction and visualization  3,014 views

Support for Parallel Scan in OpenMP  3,014 views

Legolizer: A Real-Time System for Modeling and Rendering LEGO Representations of Boundary Models  3,013 views

A Hybrid Parallel Implementation of the Aho-Corasick and Wu-Manber Algorithms Using NVIDIA CUDA and MPI Evaluated on a Biological Sequence Database  3,012 views

The Accelerator Wall: Limits of Chip Specialization  3,011 views

SWM: Simplified Wu-Manber for GPU-based Deep Packet Inspection  3,010 views

Hybrid CPU-GPU Pipeline Framework  3,010 views

GPU Virtualization  3,010 views

Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels  3,010 views

A Case for Work-stealing on FPGAs with OpenCL Atomics  3,010 views

GPU Accelerated Face Detection (thesis)  3,009 views

A Data-Parallel Graphics Pipeline Implemented in OpenCL  3,008 views

Swendsen-Wang Multi-Cluster Algorithm for the 2D/3D Ising Model on Xeon Phi and GPU  3,008 views

Towards Interactive Visual Exploration of Parallel Programs using a Domain-specific Language  3,007 views

Fractal Based Method on Hardware Acceleration for Natural Environments  3,006 views

On the Fly Porn Video Blocking Using Distributed Multi-GPU and Data Mining Approach  3,006 views

A design case study: CPU vs. GPGPU vs. FPGA  3,006 views

GPU-Based Airway Tree Segmentation and Centerline Extraction  3,006 views

High Performance Histograms on SIMT and SIMD Architectures  3,006 views

A Hybrid Approach to Parallel Connected Component Labeling Using CUDA  3,005 views

Hardware Transactional Memory for GPU Architectures  3,004 views

A Predictive Model for Solving Small Linear Algebra Problems in GPU Registers  3,004 views

Swarm-NG: a CUDA Library for Parallel n-body Integrations with focus on Simulations of Planetary Systems  3,003 views

Computing Strongly Connected Components with CUDA  3,003 views

Introducing CURRENNT: The Munich Open-Source CUDA RecurREnt Neural Network Toolkit  3,002 views

Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths  3,002 views

Multi-Scale, Multi-Level, Heterogeneous Features Extraction and Classification of Volumetric Medical Images  3,002 views

Molecular dynamics simulations through GPU video games technologies  3,001 views

3D tumor localization through real-time volumetric x-ray imaging for lung cancer radiotherapy  3,001 views

Understanding the efficiency of GPU algorithms for matrix-matrix multiplication  3,001 views

Face Recognition Using OpenCL  3,000 views

Theano-based Large-Scale Visual Recognition with Multiple GPUs  3,000 views

Celeris: A GPU-accelerated open source software with a Boussinesq-type wave solver for real-time, interactive simulation and visualization  3,000 views

Sailfish: a flexible multi-GPU implementation of the lattice Boltzmann method  2,999 views

OpenGL SuperBible: Comprehensive Tutorial and Reference (5th Edition)  2,998 views

Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer  2,997 views

The GPU Computing Era  2,996 views

Rubus: A compiler for seamless and extensible parallelism  2,996 views

A short guide to CUDA C: For physicists with multi-core graphics cards  2,994 views

Fast Burrows Wheeler Compression Using CPU and GPU  2,994 views

FastTree: A Hardware KD-Tree Construction Acceleration Engine for Real-Time Ray Tracing  2,994 views

Adapting the GA Approach to Solve Traveling Salesman Problems on CUDA Architecture  2,993 views

Matrix Multiplication with CUDA – A basic introduction to the CUDA programming model  2,992 views

Anisotropic Kuwahara Filtering on the GPU  2,991 views

Parallelization of the Generalized Hough Transform on GPU  2,991 views

Demystifying GPU microarchitecture through microbenchmarking  2,989 views

Transparent CPU-GPU Collaboration for Data-Parallel Kernels on Heterogeneous Systems  2,989 views

A Comparison of Modern GPU and CPU Architectures: And the Common Convergence of Both  2,989 views

A Case Study in Using OpenCL on FPGAs: Creating an Open-Source Accelerator of the AutoDock Molecular Docking Software  2,988 views

3D GPU Architecture using Cache Stacking: Performance, Cost, Power and Thermal analysis  2,988 views

Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning  2,987 views

Machine Learning from Streaming Data in Heterogeneous Computing Environments  2,986 views

Design and Development of an Efficient H.264 Video Encoder for CPU/GPU using OpenCL  2,986 views

Increasing GPU Throughput using Kernel Interleaved Thread Block Scheduling  2,985 views

cf4ocl: a C framework for OpenCL  2,985 views

GPU accelerated Monte Carlo simulation of the 2D and 3D Ising model  2,985 views

GPU-based ultrafast IMRT plan optimization  2,983 views

Finding Longest Common Subsequences by GPU-Based Parallel Ant Colony Optimization  2,983 views

Optimizing Performance of Recurrent Neural Networks on GPUs  2,982 views

A code motion technique for accelerating general-purpose computation on the GPU  2,982 views

Real-time Image Processing on Low Cost Embedded Computers  2,982 views

Improved Finite Difference Schemes for a 3-D Viscothermal Wave Equation on a GPU  2,980 views

GPU Array Access Auto-Tuning  2,980 views

On Vectorization of Deep Convolutional Neural Networks for Vision Tasks  2,979 views

GHOST: GPGPU-Offloaded High Performance Storage I/O Deduplication for Primary Storage System  2,977 views

Implementation of the SYCL Heterogeneous Computing Library  2,975 views

The Comparisons of OpenCL and OpenMP Computing Paradigm  2,973 views

Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning  2,972 views

Scalable Kernel Fusion for Memory-Bound GPU Applications  2,970 views

Implementation of digital down converter in GPU  2,969 views

Achieving TeraCUPS on Longest Common Subsequence Problem using GPGPUs  2,968 views

FastMag: Fast micromagnetic simulator for complex magnetic structures  2,968 views

Dynamic Memory Allocation for OpenCL  2,968 views

Multi-Threaded Automatic Integration Using OpenMP and CUDA  2,967 views

XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures  2,967 views

Molecular dynamics recipes for genome research  2,967 views

Sparser, Better, Faster GPU Parsing  2,967 views

Best Practice Guide Intel Xeon Phi v2.0  2,966 views

clMAGMA: High Performance Dense Linear Algebra with OpenCL  2,966 views

Using GPUs for Machine Learning Algorithms  2,966 views

Converting Data to Task-Parallelism by Rewrites  2,965 views

Data-Parallel Octrees for Surface Reconstruction  2,965 views

Hybrid GPU-Based Single- and Double-Bounce SAR Simulation  2,964 views

A framework to implement a multifrontal scheme on GPU architectures with OpenCL  2,963 views

Pseudorandom Numbers Generation for Monte Carlo Simulations on GPUs: OpenCL Approach  2,963 views

State Lattice-based Motion Planning for Autonomous On-Road Driving  2,962 views

High productivity multi-device exploitation with the Heterogeneous Programming Library  2,961 views

Estimation of Skin Optical Parameters for Real-Time Hyperspectral Imaging Applications using GPGPU Parallel Computing  2,960 views

Parallel Execution of AES-CTR Algorithm Using Extended Block Size  2,960 views

GeNN: a code generation framework for accelerated brain simulations  2,960 views

 

Brief statistics for this page

Titles: 100

Total views: 299,183

 


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org