Computer science | Page 175

CLBlast: A Tuned OpenCL BLAS Library

Cedric Nugteren

Tags: AMD Radeon R9 M370X, ARM, ATI, BLAS, Computer science, Intel HD 5100, Linear Algebra, Machine learning, nVidia, nVidia GeForce GTX 750 Ti, nVidia GeForce GTX Titan X, OpenCL, Package

May 18, 2017 by hgpu

Group Marching Tree: Sampling-Based Approximately Optimal Motion Planning on GPUs

Brian Ichter, Edward Schmerling, Marco Pavone

Tags: Algorithms, Computer science, CUDA, nVidia, nVidia GeForce GTX 980, nVidia Tegra TX1, Robotics

May 18, 2017 by hgpu

Efficient Parallel Methods for Deep Reinforcement Learning

Alfredo V. Clemente, Humberto N. Castejon, Arjun Chandra

Tags: Algorithms, Computer science, CUDA, Deep learning, nVidia, nVidia GeForce GTX 980 Ti, Package, Python, TensorFlow

May 18, 2017 by hgpu

Real-Time Adaptive Image Compression

Oren Rippel, Lubomir Bourdev

Tags: Algorithms, Compression, Computer science, Computer vision, Machine learning, nVidia, nVidia GeForce GTX 980 Ti

May 18, 2017 by hgpu

Block-Parallel IDA* for GPUs

Satoru Horie, Alex Fukunaga

Tags: Artificial intelligence, Computer science, CUDA, nVidia, nVidia GRID K520, Package

May 11, 2017 by hgpu

A Design Methodology for Efficient Implementation of Deconvolutional Neural Networks on an FPGA

Xinyu Zhang, Srinjoy Das, Ojash Neopane, Ken Kreutz-Delgado

Tags: Algorithms, Computer science, Computer vision, Deep learning, DSP, FPGA, Neural networks

May 11, 2017 by hgpu

DeepMetabolism: A Deep Learning System to Predict Phenotype from Genome Sequencing

Weihua Guo, You Xu, Xueyang Feng

Tags: Biology, Computer science, Deep learning, Medicine, nVidia, nVidia GeForce GTX 1080, Package, Python, TensorFlow

May 11, 2017 by hgpu

Resource-Aware Just-in-Time OpenCL Compiler for Coarse-Grained FPGA Overlays

Abhishek Kumar Jain, Douglas L. Maskell, Suhaib A. Fahmy

Tags: ARM, Computer science, DSP, FPGA, Heterogeneous systems, OpenCL

May 11, 2017 by hgpu

Towards Enhancing Performance, Programmability, and Portability in Heterogeneous Computing

Konstantinos Krommydas

Tags: ATI, ATI Radeon HD 6550, ATI Radeon HD 7660, ATI Radeon HD 7970, Code generation, Compilers, Computer science, FPGA, Heterogeneous systems, Intel Xeon Phi, nVidia, OpenCL, Package, Performance, Tesla K20, Thesis

May 9, 2017 by hgpu

Efficient Parallel Strategy Improvement for Parity Games

John Fearnley

Tags: Algorithms, Computer science, CUDA, Data Structures and Algorithms, Logic in Computer Science, nVidia, nVidia GeForce GTX 780, Package

May 9, 2017 by hgpu

Fast Sorting Algorithms using AVX-512 on Intel Knights Landing

Berenger Bramas

Tags: Algorithms, Computer science, Intel Xeon Phi, OpenMP, Package, Sorting

May 9, 2017 by hgpu

Acceleration of Deep Learning on FPGA

Huyuan Li

Tags: CNN, Computer science, Deep learning, FPGA, Neural networks, nVidia, OpenCL, Tesla K40, Thesis

May 9, 2017 by hgpu

* * *

high performance computing on graphics processing units: hgpu.org
