high performance computing on graphics processing units: hgpu.org

A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine

Koji Nakano

Department of Information Engineering, Hiroshima University, Kagamiyama 1-4-1, Higashi Hiroshima, 739-8527 Japan

International Symposium on Computing and Networking, pp. 86-95, 2014

DOI:10.1109/CANDAR.2014.14

The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of the architecture of CUDA-enabled GPUs. The main contribution of this paper is an efficient implementation, on the HMM, of the O(n^3)-time dynamic programming algorithm for solving the optimal triangulation problem for a convex n-gon. Although the HMM can run many threads in parallel, it is very hard to accelerate computation involving complicated memory access, such as the dynamic programming for the optimal triangulation problem. For problems involving complicated stride memory access, the acceleration rate is often limited by the bandwidth w of the global memory. Quite surprisingly, our implementation of the dynamic programming algorithm for solving the optimal triangulation problem runs in O(n^3/w^2) time units using max(wL, w^2l) threads on the HMM with bandwidth w, global memory latency L, and shared memory latency l. Hence, this parallel algorithm achieves an acceleration rate of more than w, even though the dynamic programming algorithm involves complicated stride memory access. We also prove that this parallel algorithm is time optimal when L=O(wl).
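For orientation, the O(n^3)-time dynamic programming algorithm the abstract refers to can be sketched sequentially as below. This is only a minimal reference sketch, not the paper's HMM implementation. It assumes the common formulation in which every chord of the convex n-gon carries a weight and the goal is to minimize the total weight of the chords used. The function name `optimal_triangulation` and the weight-matrix input `c` are hypothetical names chosen for illustration.

```python
def optimal_triangulation(c):
    """Minimum total chord weight over all triangulations of a convex n-gon.

    Assumption: c is an n x n matrix where c[i][j] (i < j) is the weight of
    the chord joining vertices i and j. Polygon sides are treated as weight 0,
    whatever value the matrix holds for them.
    """
    n = len(c)
    if n < 3:
        raise ValueError("a polygon needs at least 3 vertices")

    def weight(i, j):
        # Sides of the polygon (including the side between 0 and n-1) cost 0.
        if j - i == 1 or (i == 0 and j == n - 1):
            return 0
        return c[i][j]

    # t[i][j]: best cost for the sub-polygon on vertices i..j, including the
    # weight of the segment (i, j) itself. Adjacent vertices give cost 0.
    t = [[0] * n for _ in range(n)]
    for span in range(2, n):              # distance between i and j
        for i in range(n - span):
            j = i + span
            # Vertex k forms the triangle (i, k, j) and splits the rest.
            best = min(t[i][k] + t[k][j] for k in range(i + 1, j))
            t[i][j] = weight(i, j) + best
    return t[0][n - 1]


# Pentagon with five chords; entries for sides are unused and left at 0.
pentagon = [[0] * 5 for _ in range(5)]
pentagon[0][2] = 1
pentagon[0][3] = 10
pentagon[1][3] = 2
pentagon[1][4] = 10
pentagon[2][4] = 3
print(optimal_triangulation(pentagon))  # chords (0,2) and (2,4): 1 + 3 = 4
```

The three nested loops (over `span`, `i`, and `k`) give the O(n^3) running time, and the inner minimum reads `t[i][k]` along a row and `t[k][j]` down a column, which illustrates the kind of strided access pattern the abstract describes as hard to accelerate.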

Tags: Computer science, CUDA, Memory model, nVidia, nVidia GeForce GTX 680

January 13, 2015 by hgpu
