Views of posts on hgpu.org

Technical Report about Tiramisu: a Three-Layered Abstraction for Hiding Hardware Complexity from DSL Compilers  2,631 views

DeepBE: Learning Deep Binary Encoding for Multi-Label Classification  2,631 views

GPU-SPARC: Accelerating Parallelism in Multi-GPU Real-Time Systems  2,630 views

Optimizing Streaming Parallelism on Heterogeneous Many-Core Architectures  2,630 views

Accelerating Computational Finance Simulations with OpenCL  2,630 views

GPU Programming in a High Level Language: Compiling X10 to CUDA  2,629 views

Count Sort for GPU Computing  2,629 views

CUDAICA: GPU optimization of Infomax-ICA EEG analysis  2,628 views

Efficient Target and Application Specific Selection and Ordering of Compiler Passes  2,628 views

Single Server Multi-GPU Training of ConvNets  2,628 views

Molecular Simulations using CUDA  2,628 views

A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU + Fast Matrix Multiply  2,628 views

Implementation of 2-D Discrete Cosine Transform Algorithm on GPU  2,628 views

The Dynamical Kernel Scheduler – Part 1  2,628 views

Computational advances in gravitational microlensing: a comparison of CPU, GPU, and parallel, large data codes  2,627 views

A GPU-based Simulation for Stochastic Computing  2,627 views

Optimising Unstructured Mesh Computational Fluid Dynamics Applications on Multicores via Machine Learning and Code Transformation  2,627 views

Efficient Execution of AMR Computations on GPU Systems  2,627 views

A Modular Framework for Deformation and Fracture using GPU Shaders  2,627 views

Development of a CUDA Implementation of the 3D FDTD Method  2,627 views

An Efficient Parallel GPU Evaluation of Small Angle X-Ray Scattering Profiles  2,627 views

Loo.py: transformation-based code generation for GPUs and CPUs  2,626 views

High performance conjugate gradient solver on multi-GPU clusters using hypergraph partitioning  2,626 views

GPU Accelerated Automated Feature Extraction from Satellite Images  2,626 views

NOVA: A Functional Language for Data Parallelism  2,626 views

liquidSVM: A Fast and Versatile SVM package  2,625 views

Implementation of FDTD-Compatible Green’s Function on Heterogeneous CPU-GPU Parallel Processing System  2,625 views

Characterization and Performance Analysis for 3D Benchmarks  2,624 views

Task-based FMM for heterogeneous architectures  2,624 views

Sapporo2: A versatile direct N-body library  2,624 views

The Ocean Tensor Package  2,623 views

Simulating the universe with GPU-accelerated supercomputers: n-body methods, tests, and examples  2,623 views

Heterogeneous parallel algorithms for Computational Fluid Dynamics on unstructured meshes  2,623 views

How to scale distributed deep learning?  2,622 views

HSApriori: High Speed Association Rule Mining using Apriori Based Algorithm for GPU  2,622 views

Hardware Acceleration for Unstructured Big Data and Natural Language Processing  2,622 views

A Performance Comparison of Sort and Scan Libraries for GPUs  2,622 views

A Survey Paper on Solving TSP using Ant Colony Optimization on GPU  2,622 views

Fast Hough transform on GPUs: exploration of algorithm trade-offs  2,622 views

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs  2,621 views

Accelerating Phylogenetic Inference on GPUs: an OpenACC and CUDA comparison  2,621 views

Modern Gyrokinetic Particle-In-Cell Simulation of Fusion Plasmas on Top Supercomputers  2,621 views

Comprehensive Evaluations of Cone-beam CT dose in Image-guided Radiation Therapy via GPU-based Monte Carlo simulations  2,621 views

The battle of the giants: a case study of GPU vs FPGA optimisation for real-time image processing  2,621 views

The Performance Analysis Based on Heterogeneous Parallel Processors for Anisotropic Diffusion Filters  2,621 views

Extending OmpSs for OpenCL kernel co-execution in heterogeneous systems  2,621 views

Purine: A bi-graph based deep learning framework  2,620 views

Applications of Deep Neural Networks  2,620 views

Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS  2,620 views

High Performance Computing for Large Graphs of Internet Applications using GPU  2,619 views

GeauxDock: Accelerating Structure-Based Virtual Screening with Heterogeneous Computing  2,619 views

Performance Analysis Cluster and GPU Computing Environment on Molecular Dynamic Simulation of BRV-1 and REM2 with GROMACS  2,619 views

Computational Modelling of Galaxy Formation using FLAME GPU  2,619 views

ClearPath: highly parallel collision avoidance for multi-agent simulation  2,619 views

Initial condition for efficient mapping of level set algorithms on many-core architectures  2,619 views

Optimising Purely Functional GPU Programs  2,618 views

Automatic Performance Tuning of Pipeline Patterns for Heterogeneous Parallel Architectures  2,618 views

Accelerating simultaneous algebraic reconstruction technique with motion compensation using CUDA-enabled GPU  2,618 views

Energy Efficiency Analysis of GPUs  2,617 views

On Binaural Spatialization and the Use of GPGPU for Audio Processing  2,617 views

Regular Expression Matching on Graphics Hardware for Intrusion Detection  2,617 views

Parallel Statistical Multi-resolution Estimation  2,617 views

Non-local means denoising algorithm accelerated by GPU  2,617 views

Accelerating Ab Initio Nuclear Physics Calculations with GPUs  2,617 views

GPU accelerated feature algorithms for mobile devices  2,616 views

Designing a Unified Programming Model for Heterogeneous Machines  2,616 views

pyMIC: A Python Offload Module for the Intel Xeon Phi Coprocessor  2,616 views

JCudaMP: OpenMP/Java on CUDA  2,616 views

Analysis & Design of Efficient Cryptographic Systems  2,615 views

A dataflow-like programming model for future hybrid clusters  2,615 views

Object Oriented Framework for CUDA based Pyramidal Image Blending  2,614 views

Comparing GPU and CPU in OLAP Cubes Creation  2,614 views

Using Compute Unified Device Architecture (CUDA) in Parallelizing Different Digital Image Processing Techniques  2,614 views

Can GPGPU Programming Be Liberated from the Data-Parallel Bottleneck?  2,614 views

High-quality surface splatting on today’s GPUs  2,613 views

A Quantitative Study of Irregular Programs on GPUs  2,613 views

Improved GPU Co-processor Sorting Algorithm with Barrier Synchronization  2,613 views

Advanced Trends of Heterogeneous Computing with CPU-GPU Integration: Comparative Study  2,613 views

PFAC Library: GPU-based string matching algorithm  2,613 views

GPU-Accelerated BWT Construction for Large Collection of Short Reads  2,613 views

A Quantitative Performance Analysis Model for GPU Architectures  2,613 views

A High Performance Random Number Generator Using Heterogeneous Computing Platform  2,613 views

Precision and Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs  2,612 views

Implementation of a Power Efficient Synthetic Aperture Radar Back Projection Algorithm on FPGAs Using OpenCL  2,612 views

Accelerating Cost Aggregation for Real-Time Stereo Matching  2,612 views

Parallelization techniques of the x264 video encoder  2,612 views

Parallelization of the Local Threshold and Boolean Function Based Edge Detection Algorithm Using CUDA  2,612 views

Architecture-Adaptive Code Variant Tuning  2,611 views

Evaluating the Power of GPU Acceleration for IDW Interpolation Algorithm  2,611 views

Parallel AES algorithm for fast Data Encryption on GPU  2,611 views

FPGA vs. multi-core CPUs vs. GPUs: hands-on experience with a sorting application  2,611 views

Parallel Game Tree Search Using GPU  2,610 views

Graphics Processing Unit (GPU) Implementation Methodology of AERMOD Model  2,610 views

Using P System with GPU Model to Design and Implement a Public Key Cryptography  2,609 views

A Parallel Recursive Approach for Solving All Pairs Shortest Path Problem on GPU using OpenCL  2,609 views

GPU-BSM: A GPU-Based Tool to Map Bisulfite-Treated Reads  2,608 views

VoxelPipe: a programmable pipeline for 3D voxelization  2,608 views

GPU-Accelerated Numerical Simulations of the Knudsen Gas on Time-Dependent Domains  2,608 views

Fast parallel surface and solid voxelization on GPUs  2,608 views

K-nearest neighbor search: Fast GPU-based implementations and application to high-dimensional feature matching  2,608 views

 

Brief statistics for this page

Titles: 100

Total views: 261,935

 


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org