high performance computing on graphics processing units: hgpu.org

Elementary functions

Elementary functions: towards automatically generated, efficient, and vectorizable implementations

Hugues De Lassus Saint-Genies

Tags: Algorithms, Code generation, Computer science, Elementary functions, Performance, Thesis

July 28, 2018 by hgpu

GPU-accelerated generation of correctly-rounded elementary functions

Pierre Fortin, Mourad Gouicem, Stef Graillat

Tags: Algorithms, Computer science, CUDA, Elementary functions, nVidia, Tesla C2070

June 2, 2013 by hgpu

Correctly rounding elementary functions on GPU

Pierre Fortin, Mourad Gouicem, Stef Graillat

Tags: Algorithms, Computer science, CUDA, Elementary functions, nVidia, Tesla C2070

November 14, 2012 by hgpu

GMP implementation on CUDA – A Backward Compatible Design With Performance Tuning

Hao Jun Liu, Chu Tong

Tags: Algorithms, Computer science, CUDA, Elementary functions, nVidia, nVidia GeForce GTX 280, Performance, Security

February 7, 2012 by hgpu

Towards solving the Table Maker’s Dilemma on GPU

Pierre Fortin, Mourad Gouicem, Stef Graillat

Tags: Algorithms, Computer science, CUDA, Elementary functions, nVidia, Tesla C2070

November 23, 2011 by hgpu

Unified Tables for Exponential and Logarithm Families

Christopher Kumar Anand, Anuroop Sharma

Tags: Algorithms, Cell processor, Computer science, Elementary functions, Mathematics

September 11, 2011 by hgpu

Efficient evaluation methods of elementary functions suitable for SIMD computation

Naoki Shibata

Tags: Algorithms, Computer science, Elementary functions

November 20, 2010 by hgpu
