high performance computing on graphics processing units: hgpu.org

hgpu.org » Pthreads

On a Simplified Approach to Achieve Parallel Performance and Portability Across CPU and GPU Architectures

Nathaniel Morgan, Caleb Yenusah, Adrian Diaz, Daniel Dunning, Jacob Moore, Erin Heilman, Calvin Roth, Evan Lieberman, Steven Walton, Sarah Brown, Daniel Holladay, Marko Knezevic, Gavin Whetstone, Zachary Baker, Robert Robey

Tags: Computer science, CUDA, Fortran, HIP, nVidia, Package, performance portability, Pthreads

November 10, 2024 by hgpu

EASYPAP: a Framework for Learning Parallel Programming

Alice Lasserre, Raymond Namyst, Pierre-André Wacrenier

Tags: Algorithms, AVX, Computer science, Education, MPI, OpenCL, OpenMP, Package, Pthreads

February 16, 2020 by hgpu

Parallelizing Map Projection of Raster Data on Multi-core CPU and GPU Parallel Programming Frameworks

Daniel Chavez

Tags: Computer science, CUDA, nVidia, nVidia Quadro 1000 M, OpenCL, Pthreads, Thesis

June 28, 2016 by hgpu

Performance Analysis of kNN on large datasets using CUDA & Pthreads

Sriram Kankatala

Tags: Computer science, CUDA, Data mining, Databases, Machine learning, Nearest neighbour, nVidia, nVidia GeForce GTX 550 Ti, Pattern recognition, Pthreads, Thesis

March 5, 2016 by hgpu

Parallel 3D Fast Wavelet Transform comparison on CPUs and GPUs

Gregorio Bernabe

Tags: Computer science, CUDA, FFT, nVidia, OpenCL, OpenMP, Pthreads, Tesla C2050

December 31, 2015 by hgpu

Transforming C OpenMP Programs for Verification in CIVL

Michael Rogers

Tags: Computer science, CUDA, MPI, nVidia, OpenCL, OpenMP, Pthreads, Thesis

December 10, 2015 by hgpu

Climbing Mont Blanc – A Training Site for Energy Efficient Programming on Heterogeneous Multicore Processors

Lasse Natvig, Torbjorn Follan, Simen Stoa, Sindre Magnussen, Antonio Garcia Guirado

Tags: Computer science, Energy-efficient computing, Heterogeneous systems, OpenCL, OpenMP, Pthreads

November 11, 2015 by hgpu

Performance analysis of GPGPU and CPU On AES Encryption

Akash Kiran Neelap

Tags: AES, Algorithms, Computer science, CUDA, nVidia, nVidia Quadro K4000, Pthreads, Security, Thesis

July 15, 2015 by hgpu

Multi-GPU Support on Shared Memory System using Directive-based Programming Model

Rengan Xu, Xiaonan Tian, Sunita Chandrasekaran, Barbara Chapman

Tags: Computer science, CUDA, Heterogeneous systems, nVidia, OpenACC, Pthreads, Tesla K20

February 2, 2015 by hgpu

A Review of CUDA, MapReduce, and Pthreads Parallel Computing Models

Kato Mivule, Benjamin Harvey, Crystal Cobb, Hoda El Sayed

Tags: Computer science, CUDA, MapReduce, nVidia, Overview, Pthreads, Review, Tesla K20

October 18, 2014 by hgpu

Detection of retransmissions in 10G Ethernet using GPUs

Paula Roquero Fuentes

Tags: Algorithms, Computer science, CUDA, nVidia, Pthreads, Security, Tesla C2075, Thesis

September 3, 2014 by hgpu

Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System

Alan Humphrey, Qingyu Meng, Martin Berzins, Todd Harman

Tags: CUDA, Fluid dynamics, GPU cluster, Heterogeneous systems, MPI, nVidia, Partial differential equations, Pthreads, Task scheduling, Tesla M2090

May 11, 2012 by hgpu
