Views of posts on hgpu.org

Parallel Verlet neighbor list algorithm for GPU-optimized MD simulations  2,278 views

Performance Evaluation of the Intel Many Integrated Core Architecture for 3D Image Reconstruction in Computed Tomography  2,277 views

CLTune: A Generic Auto-Tuner for OpenCL Kernels  2,277 views

Implementation of 2-D Discrete Cosine Transform Algorithm on GPU  2,276 views

High Performance Computing for solving large sparse systems. Optical Diffraction Tomography as a case of study  2,276 views

Optimising Unstructured Mesh Computational Fluid Dynamics Applications on Multicores via Machine Learning and Code Transformation  2,276 views

GPflow: A Gaussian process library using TensorFlow  2,276 views

GPU System Call  2,275 views

A block-asynchronous relaxation method for graphics processing units  2,275 views

Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level  2,275 views

Extending the Generalized Fermat Prime Number Search Beyond One Million Digits Using GPUs  2,274 views

Multiresolution Flow Simulations on Multi/many-core Architectures  2,274 views

YaDiV-an open platform for 3D visualization and 3D segmentation of medical data  2,274 views

Fast hough transform on GPUs: exploration of algorithm trade-offs  2,273 views

FluidFFT: common API (C++ and Python) for Fast Fourier Transform HPC libraries  2,272 views

3D Modeling, Distance and Gradient Computation for Motion Planning: A Direct GPGPU Approach  2,271 views

GPUVerify: A Verifier for GPU Kernels  2,271 views

Parallel Batch Training of the Self-Organizing Map Using OpenCL  2,271 views

Cache Miss Analysis for GPU Programs Based on Stack Distance Profile  2,271 views

The Heisenberg spin glass model on GPU: myths and actual facts  2,271 views

Use NVIDIA CUDA technology to create genetic algorithms with extensive population  2,270 views

Accelerating Ant Colony Optimization-based Edge Detection on the GPU using CUDA  2,270 views

A Monte Carlo Neutron Transport Code for Eigenvalue Calculations on a Dual-GPU System and CUDA Environment  2,270 views

Single Server Multi-GPU Training of ConvNets  2,270 views

Advanced Trends of Heterogeneous Computing with CPU-GPU Integration: Comparative Study  2,270 views

Projected tetrahedra revisited: a barycentric formulation applied to digital radiograph reconstruction using higher-order attenuation functions  2,269 views

Sapporo2: A versatile direct N-body library  2,269 views

Parallel Matching and Clustering Algorithms on GPUs  2,269 views

Improving GPU Performance Prediction with Data Transfer Modeling  2,268 views

Grid-based SAH BVH construction on a GPU  2,268 views

All-Pairs Shortest Path Algorithms Using CUDA  2,267 views

Fast Implementation of DGEMM on Fermi GPU  2,267 views

GPUvm: Why Not Virtualizing GPUs at the Hypervisor?  2,267 views

A Heterogeneous Accelerated Matrix Multiplication: OpenCL + APU + GPU+ Fast Matrix Multiply  2,266 views

A Survey Paper on Solving TSP using Ant Colony Optimization on GPU  2,266 views

Multi-core CPU or GPU-accelerated Multiscale Modeling for Biomolecular Complexes  2,266 views

Energy Efficient Computing on Multi-core Processors: Vectorization and Compression Techniques  2,266 views

CUDAICA: GPU optimization of Infomax-ICA EEG analysis  2,266 views

GPU accelerated feature algorithms for mobile devices  2,266 views

Graph Processing on GPUs: A Survey  2,265 views

Machine Learning in Compilers: Past, Present and Future  2,265 views

GPU’s for event reconstruction in the FairRoot framework  2,265 views

A Comparative Measurement Study of Deep Learning as a Service Framework  2,264 views

Structural Agnostic SpMV: Adapting CSR-Adaptive for Irregular Matrices  2,264 views

NOVA: A Functional Language for Data Parallelism  2,264 views

Using Compute Unified Device Architecture (CUDA) in Parallelizing Different Digital Image Processing Techniques  2,263 views

Mars: Accelerating MapReduce with Graphics Processors  2,263 views

Implementation of a High Throughput 3GPP Turbo Decoder on GPU  2,263 views

A Cross-platform Evaluation of Graphics Shader Compiler Optimization  2,262 views

Barra, a Modular Functional GPU Simulator for GPGPU  2,262 views

CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models  2,262 views

wav2letter++: The Fastest Open-source Speech Recognition System  2,262 views

Parallel Statistical Multi-resolution Estimation  2,261 views

Sylkan: Towards a Vulkan Compute Target Platform for SYCL  2,261 views

Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines  2,261 views

Learning Structured Sparsity in Deep Neural Networks  2,261 views

A closer look at GPUs  2,261 views

VoxelPipe: a programmable pipeline for 3D voxelization  2,260 views

Behavioral Non-portability in Scientific Numeric Computing  2,260 views

Image Processing with CUDA  2,260 views

Efficient Execution of AMR Computations on GPU Systems  2,260 views

How to distribute most efficiently a computation intensive calculation on an Android device to external compute units with an Android API  2,259 views

Speculative Execution on Multi-GPU Systems  2,259 views

DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications  2,259 views

Scalability of Self-organizing Maps on a GPU cluster using OpenCL and CUDA  2,259 views

Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing  2,259 views

GPU-Accelerated Numerical Simulations of the Knudsen Gas on Time-Dependent Domains  2,258 views

GPGPU-Sim  2,258 views

A Generic Inverted Index Framework for Similarity Search on the GPU  2,258 views

The Ocean Tensor Package  2,258 views

Modeling Deep Learning Accelerator Enabled GPUs  2,258 views

GPU Ray Marching for Real-Time Rendering of Participating Media  2,257 views

Effective GPU Strategies for LU Decomposition  2,257 views

GPU-Accelerated BWT Construction for Large Collection of Short Reads  2,256 views

Massively parallel simulations of relativistic fluid dynamics on graphics processing units with CUDA  2,256 views

GPU Programming in a High Level Language: Compiling X10 to CUDA  2,256 views

Nodal Discontinuous Galerkin Methods on Graphics Processors  2,256 views

GPU Accelerated Particle Visualization with Splotch  2,255 views

pyMIC: A Python Offload Module for the Intel Xeon Phi Coprocessor  2,255 views

Enhancing Efficiency of the RRTMG Radiation Code with GPU and MIC Approaches for Numerical Weather Prediction Models  2,255 views

A comparative analysis of the performance and deployment overhead of parallelized Finite Difference Time Domain (FDTD) algorithms on a selection of high performance multiprocessor computing systems  2,255 views

Comparing Parallel Hardware Architectures for Visually Guided Robot Navigation  2,255 views

Deep Learning for Computational Chemistry  2,255 views

Performance of Kepler GTX Titan GPUs and Xeon Phi System  2,255 views

GPU-accelerated Gibbs Sampling  2,254 views

GPU accelerated toolbox for real-time beam-shaping in multimode fibres  2,254 views

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions  2,254 views

Chrono: a parallel multi-physics library for rigid-body, flexible-body, and fluid dynamics  2,254 views

GPU packet classification using OpenCL: a consideration of viable classification methods  2,253 views

Implementation of FDTD-Compatible Green’s Function on Heterogeneous CPU-GPU Parallel Processing System  2,253 views

Parallel LZ77 Decoding using a GPU  2,253 views

Adaboost GPU-based Classifier for Direct Volume Rendering  2,252 views

A High Performance Random Number Generator Using Heterogeneous Computing Platform  2,252 views

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin  2,252 views

Simulating the universe with GPU-accelerated supercomputers: n-body methods, tests, and examples  2,252 views

AMGCL: an Efficient, Flexible, and Extensible Algebraic Multigrid Implementation  2,252 views

A Compiler and Runtime for Heterogeneous Computing  2,252 views

Making Human Connectome Faster: GPU Acceleration of Brain Network Analysis  2,251 views

Fast parallel Particle-To-Grid interpolation for plasma PIC simulations on the GPU  2,251 views

An MPI-Based Python Framework for Distributed Training with Keras  2,250 views

 

Brief statistics for this page

Titles: 100

Total views: 226,289

 


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
