high performance computing on graphics processing units: hgpu.org

Parallel programming in mobile devices with FancyJCL

Sergio Afonso, Óscar Gómez-Cárdenes, Paula Expósito, Vicente Blanco, Francisco Almeida

Tags: Computer science, DSP, Heterogeneous systems, Image processing, Java, OpenCL, Package

March 3, 2024 by hgpu

eGPU: A 750 MHz Class Soft GPGPU for FPGA

Martin Langhammer, George Constantinides

Tags: Computer science, DSP, FPGA, Hardware Architecture

July 24, 2023 by hgpu

Compilation and Design Space Exploration of Dataflow Programs for Heterogeneous CPU-GPU Platforms

Aurélien François Gilbert Bloch

Tags: Compilers, Computer science, CUDA, Design space exploration, DSP, FPGA, Heterogeneous systems, nVidia, nVidia GeForce GTX 1660, nVidia GeForce RTX 3080 Ti, Package, Thesis

June 25, 2023 by hgpu

FPGA Implementation of Bluetooth Low Energy Physical Layer with OpenCL

Ganyong Mo

Tags: DSP, FPGA, OpenCL, Signal processing, Thesis

July 10, 2022 by hgpu

Fast Arbitrary Precision Floating Point on FPGA

Johannes de Fine Licht, Christopher A. Pattison, Alexandros Nikolaos Ziogas, David Simmons-Duffin, Torsten Hoefler

Tags: Computer science, DSP, FPGA, Matrix multiplication, Package

April 17, 2022 by hgpu

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark

Yuhang Li, Mingzhu Shen, Jian Ma, Yan Ren, Mingxin Zhao, Qi Zhang, Ruihao Gong, Fengwei Yu, Junjie Yan

Tags: Algorithms, ASIC, Benchmarking, Computer science, Deep learning, DSP, Package

November 14, 2021 by hgpu

Thermal Safety and Real-Time Predictability on Heterogeneous Embedded SoC Platforms

Seyed Mehdi Hosseini Motlagh

Tags: Computer science, DSP, Heterogeneous systems, SoC, Thesis

January 3, 2021 by hgpu

Systolic-CNN: An OpenCL-defined Scalable Run-time-flexible FPGA Accelerator Architecture for Accelerating Convolutional Neural Network Inference in Cloud/Edge Computing

Akshay Dua, Yixing Li, Fengbo Ren

Tags: Cloud, Computer science, DSP, FPGA, Hardware Architecture, Neural networks, OpenCL, Package

December 13, 2020 by hgpu

Using Machine Learning to Estimate Utilization and Throughput for OpenCL-Based SpMV Implementation on an FPGA

Jannatun Naher, Clay Gloster, Shrikant S. Jadhav, Christopher C. Doss

Tags: Computer science, Design space exploration, DSP, FPGA, Linear Algebra, Machine learning, OpenCL, Sparse matrix

April 12, 2020 by hgpu

Hardware Implementation and Quantization of Tiny-Yolo-v2 using OpenCL

Yap June Wai, Zulkalnain bin Mohd Yussof, Sani Irwan bin Md Salim

Tags: Computer science, Computer vision, Deep learning, DSP, FPGA, Neural networks, OpenCL

January 19, 2020 by hgpu

Software Compilation Techniques for Heterogeneous Embedded Multi-Core Systems

Rainer Leupers, Miguel Angel Aguilar, Jeronimo Castrillon, Weihua Sheng

Tags: ARM, Code generation, Compilers, Computer science, DSP, Heterogeneous systems

June 16, 2019 by hgpu

Accelerating ternary quantized convolutional neural networks using OpenCL for FPGA

Victor Joos de ter Beerst, Antoine Vanderschueren

Tags: Computer science, DSP, FPGA, Neural networks, OpenCL, Package, Thesis

March 24, 2019 by hgpu
