high performance computing on graphics processing units: hgpu.org

Ameliorating Memory Contention of OLAP operators on GPU Processors

Evangelia A. Sitaridi, Kenneth A. Ross

Dept. of Computer Science, Columbia University

Eighth International Workshop on Data Management on New Hardware (DaMoN ’12), 2012

DOI:10.1145/2236584.2236590

@inproceedings{sitaridi2012ameliorating,
  title={Ameliorating Memory Contention of OLAP operators on GPU Processors},
  author={Sitaridi, E.A. and Ross, K.A.},
  booktitle={Proceedings of the Eighth International Workshop on Data Management on New Hardware},
  pages={39--47},
  year={2012},
  organization={ACM}
}

Implementations of database operators on GPU processors have shown dramatic performance improvements compared to multicore-CPU implementations. GPU threads can cooperate using shared memory, which is organized in interleaved banks and is fast only when threads read and modify addresses belonging to distinct memory banks. Therefore, data processing operators implemented on a GPU, in addition to contention caused by popular values, have to deal with a new performance-limiting factor: thread serialization when accessing values belonging to the same bank. Here, we define the problem of bank and value conflict optimization for data processing operators using the CUDA platform. To analyze the impact of these two factors on operator performance we use two database operations: foreign-key join and grouped aggregation. We suggest and evaluate techniques for optimizing the data arrangement offline by creating clones of values to reduce overall memory contention. Results indicate that columns used for writes, such as grouping columns, need to be optimized to fully exploit the maximum bandwidth of shared memory.
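The two sources of contention named in the abstract, bank conflicts and value conflicts, can be illustrated with a small host-side model. The sketch below is not the authors' code and does not run on a GPU. It assumes 32 banks of 4-byte words (the usual layout on Fermi-class hardware such as the Tesla C2070) and assumes that concurrent updates from one warp to the same bank are fully serialized. The names `bank_of`, `write_serialization`, and `cloned_addresses`, and the round-robin choice of clone per thread, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

NUM_BANKS = 32   # assumption: 32 shared-memory banks
WORD_BYTES = 4   # assumption: consecutive 4-byte words map to consecutive banks
WARP_SIZE = 32   # threads that issue a shared-memory access together


def bank_of(byte_address):
    """Bank holding the 4-byte word that contains byte_address."""
    return (byte_address // WORD_BYTES) % NUM_BANKS


def write_serialization(byte_addresses):
    """Rough number of serialized passes for one warp's updates.

    Simplified model: updates that land in the same bank are serialized,
    whether they target the same word (value conflict) or different
    words in that bank (bank conflict).
    """
    per_bank = defaultdict(int)
    for addr in byte_addresses:
        per_bank[bank_of(addr)] += 1
    return max(per_bank.values(), default=0)


def cloned_addresses(base_word, num_clones):
    """Addresses used when one hot value has num_clones copies.

    Hypothetical layout: the clones occupy consecutive words starting at
    base_word, and thread t updates clone t % num_clones.
    """
    return [(base_word + t % num_clones) * WORD_BYTES for t in range(WARP_SIZE)]


# Each thread updates its own consecutive word: every bank is used once.
conflict_free = write_serialization([t * WORD_BYTES for t in range(WARP_SIZE)])

# Stride of 32 words: distinct values, but all of them live in bank 0.
bank_conflict = write_serialization(
    [t * NUM_BANKS * WORD_BYTES for t in range(WARP_SIZE)]
)

# One popular group: every thread updates the same accumulator.
value_conflict = write_serialization([0] * WARP_SIZE)

# Four clones of the popular accumulator spread the updates over four banks.
four_clones = write_serialization(cloned_addresses(0, 4))

print(conflict_free, bank_conflict, value_conflict, four_clones)  # 1 32 32 8
```

Under this model a single hot value costs as many passes as the worst bank conflict, and each doubling of the clone count halves the serialization until every thread has its own bank. The clones then have to be combined into a final result, a cost the model leaves out.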

Tags: Computer science, CUDA, Databases, nVidia, Tesla C2070

June 8, 2012 by hgpu
