
Papers on hgpu.org (.txt-file)

Belief Propagation by Message Passing in Junction Trees: Computing Each Message Faster Using GPU Parallelization Download

Belief Propagation on the GPU for Stereo Vision Download

Believe it or Not! Multi-core CPUs Can Match GPU Performance for FLOP-intensive Application! Download

Bempp-cl: A fast Python based just-in-time compiling boundary element library Download Package

BenchDirect: A Directed Language Model for Compiler Benchmarks Download Package

BenchFriend: Correlating the Performance of GPU Benchmarks Download

BENCHIP: Benchmarking Intelligence Processors Download

Benchmarking a Proof-of-Concept Performance Portable SYCL-based Fast Fourier Transformation Library Download

Benchmarking Across Platforms: European Option Pricing Download

Benchmarking and Dissecting the Nvidia Hopper GPU Architecture Download

Benchmarking and Implementation of Probability-Based Simulations on Programmable Graphics Cards Download

Benchmarking and modelling of POWER7, Westmere, BG/P, and GPUs: an industry case study Download

Benchmarking and Optimization of Gradient Boosted Decision Tree Algorithms Download

Benchmarking Data Analysis and Machine Learning Applications on the Intel KNL Many-Core Processor Download

Benchmarking Deep Learning Models on Jetson TX2 Download Package

Benchmarking GPU and CPU codes for Heisenberg spin glass overrelaxation

Benchmarking GPU and TPU Performance with Graph Neural Networks Download

Benchmarking GPU Devices with N-Body Simulations Download

Benchmarking GPUs to tune dense linear algebra Download

Benchmarking Harp-DAAL: High Performance Hadoop on KNL Clusters Download Package

Benchmarking Intel Xeon Phi to Guide Kernel Design Download

Benchmarking Modern Edge Devices for AI Applications Download

Benchmarking Next Generation Hardware Platforms: An Experimental Approach Download

Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: programming productivity, performance, and energy consumption Download

Benchmarking optimization algorithms for auto-tuning GPU kernels Download

Benchmarking Parallel Performance on Many-Core Processors Download

Benchmarking performance of a hybrid Xeon/Xeon Phi system for parallel computation of similarity measures between large vectors Download

Benchmarking State-of-the-Art Deep Learning Software Tools Download Package

Benchmarking the cost of thread divergence in CUDA Download

Benchmarking the Intel Xeon Phi Coprocessor Download

Benchmarking the Memory Hierarchy of Modern GPUs Download Package

Benchmarking the Nvidia GPU Lineage: From Early K80 to Modern A100 with Asynchronous Memory Transfers Download

Benchmarking TPU, GPU, and CPU Platforms for Deep Learning Download

Benchmarks Based on Anti-Parallel Patterns for the Evaluation of GPUs Download

Benchmarks for Intel MIC Architecture Download

BenchPress: A Deep Active Benchmark Generator Download Package

Berkeley Dwarfs on CUDA Download

Best bang for your buck: GPU nodes for GROMACS biomolecular simulations Download Package

Best Practice Guide – GPGPU Download

Best Practice Guide – Intel Xeon Phi Download

Best Practice Guide Intel Xeon Phi v2.0 Download

Best-effort semantic document search on GPUs

Betatron tune measurement with the LHC damper using a GPU Download

Better GPU Hash Tables Download

Better speedups using simpler parallel programming for graph connectivity and biconnectivity Download

Betweenness Centrality on GPUs and Heterogeneous Architectures Download Package

Beyond 16GB: Out-of-Core Stencil Computations Download

Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising Download Package

Beyond Amdahl’s Law: An Objective Function That Links Multiprocessor Performance Gains To Delay and Energy Download

Beyond Desktop Computation: Challenges in Scaling a GPU Infrastructure Download

Beyond programmable shading (parts I and II) Download

Beyond Straightforward Vectorization of Lightweight Data Compression Algorithms for Larger Vector Sizes Download

BFROST: Binary Features from Robust Orientation Segment Tests accelerated on the GPU Download

Bi-directional Path Tracing on GPU Download Package

Bidimensional Median Filter for Parallel Computing Architectures Download

BIDMach: Large-scale Learning with Zero Memory Allocation Download Package

Bifrost: a Python/C++ Framework for High-Throughput Stream Processing in Astronomy Download Package

Big Integer Multiplication with CUDA FFT (cuFFT) Library Download

Bigger Buffer k-d Trees on Multi-Many-Core Systems Download Package

BigKernel — High Performance CPU-GPU Communication Pipelining for Big Data-style Applications Download

Bilateral Filtering with CUDA Download Package

Billion-scale similarity search with GPUs Download Package

Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models Download

Binary Interval Search (BITS): A Scalable Algorithm for Counting Interval Intersections Download Package

Binary Interval Search: a scalable algorithm for counting interval intersections Download Package

Binary Mesh Partitioning for Cache-Efficient Visualization Download

Binary Segmentation of Video Sequences in Real Time

BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 Download Package

Binaural Simulations Using Audio Rate FDTD Schemes and CUDA Download

Binomial American Option Pricing on CPU-GPU Hetergenous System Download

Bio-inspired computer visual system using GPU and Visual Pattern Assessment Language (ViPAL): Application on breast cancer prognosis Download

Bio-Inspired Optimization of Ultra-Wideband Patch Antennas Using Graphics Processing Unit Acceleration Download

Bio-sequence database scanning on a GPU Download

BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images Download Package

Bioinformatics Sequence Comparisons on Manycore Processors Download

Biomedical and Clinical English Model Packages in the Stanza Python NLP Library Download Package

Biomedical image analysis on a cooperative cluster of GPUs and multicores Download

Biomolecular electrostatics simulation with a parallel FMM-based BEM, using up to 512 GPUs Download Package

Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns Download Package

Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU Download Package

Bit-level Parallelization of 3DES Encryption on GPU Download Package

Bit-Packed Damaged Lattice Potts Model Simulations with CUDA and GPUs Download

Bit-Parallel Multiple Pattern Matching Download

Bit-Vectorized GPU Implementation of a Stochastic Cellular Automaton Model for Surface Growth Download

Bitcoin and The Age of Bespoke Silicon Download

BitCracker: BitLocker meets GPUs Download Package

Bitmap Filter: Speeding up Exact Set Similarity Joins with Bitwise Operations Download

BlaBla: Linguistic Feature Extraction for Clinical Analysis in Multiple Languages Download Package

Black-Box Side-Channel Attacks Highlight the Importance of Countermeasures: An Analysis of the Xilinx Virtex-4 and Virtex-5 Bitstream Encryption Mechanism Download

BLAS Comparison on FPGA, CPU and GPU Download

Blasting through lattice calculations using CUDA Download

BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing Download Package

Blind image deconvolution algorithm on NVIDIA CUDA platform Download

Blink: Fast and Generic Collectives for Distributed ML Download

Blister: GPU-based rendering of Boolean combinations of free-form triangulated shapes Download

Block based Singular Value Decomposition approach to matrix factorization for recommender systems Download Package

Block Conjugate Gradient Solver in OpenCL Download

Block Time Step Storage Scheme for Astrophysical N-body Simulations Download

Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems Download

Block-Parallel IDA* for GPUs Download Package

 

Brief statistics for this page

Titles: 100

Download open PDFs: 97

Packages: 30

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
