high performance computing on graphics processing units: hgpu.org

hgpu.org » Applications » Computer science » Towards Understanding and Mitigating Memory-Access Challenges in Computing Systems

Towards Understanding and Mitigating Memory-Access Challenges in Computing Systems

Mohammad K. Dashti

The University of British Columbia

The University of British Columbia, 2022

DOI:10.14288/1.0417420

BibTeX

Download (PDF)

View

Source

Source codes

Package:

PIM-JPEG: JPEG autoscaling in memory

1107

views

In an era of hardware diversity, the management of applications’ allocated memory is a complex task that can have significant performance repercussions. Non-uniformity in the memory hierarchy, along with heterogeneity and asymmetry of chip designs, make the costs of memory accesses unpredictable if the allocated memory is not managed carefully. Poor memory allocation and placement on symmetric, non-uniform memory access (NUMA) server systems can cause interconnect link congestion and memory controller contention, which can drastically impact performance. Furthermore, asymmetric systems consisting of CPU and GPU cores suffer similarly depending on how memory is allocated and what types of cores are accessing it. Furthermore, modern chip designs are integrating compute units into the memory modules to bring compute closer to memory and eliminate the high costs of transferring data. Hence, accessing memory efficiently is becoming increasingly challenging as systems evolve toward heterogeneity. Our contribution is a detailed analysis and insight into software and hardware memory management techniques on modern symmetric server-class and asymmetric heterogeneous processors consisting of integrated CPU and GPU cores, and to recently commercially available Processing In Memory (PIM) systems. In particular, we examine three types of systems that are affected by memory access bottlenecks. First, we look at NUMA systems, then integrated CPU-GPU systems, and finally PIM systems. We analyze these systems and suggest possible solutions to mitigate memory access issues. For NUMA systems, we propose a holistic memory management algorithm that intelligently distributes memory pages to reduce congestion and improve performance. Then, we provide a detailed analysis of memory management methods on integrated CPU-GPU systems focusing on performance and functionality trade-offs. Our goal is to expose the performance impact of memory management techniques and assess the viability and advantage of running applications with complex behaviors on such integrated CPU-GPU systems. Finally, we examine PIM systems with a case study of image decoding, which is a critical stage for many deep learning applications. We show that the extreme compute scalability of PIM systems can be utilized to accelerate image decoding with performance potential that can surpass CPUs and GPUs.

Tags: Computer science, CUDA, Deep learning, Heterogeneous systems, Memory model, nVidia, nVidia GeForce RTX 2070, nVidia Jetson TK1, OpenCL, Package, Thesis

August 28, 2022 by hgpu

No votes yet.

Please wait...

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

Engineering Supercomputing Platforms for Biomolecular Applications

high performance computing on graphics processing units: hgpu.org

Towards Understanding and Mitigating Memory-Access Challenges in Computing Systems

Package:

Recent source codes

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluser of smartphones for edge computing application using TensorFlow

SYCL Container

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

Most viewed papers (last 30 days)

Towards Understanding and Mitigating Memory-Access Challenges in Computing Systems

Package:

Share this:

Recent source codes

Most viewed papers (last 30 days)