high performance computing on graphics processing units: hgpu.org

Speculative Parallel Evaluation Of Classification Trees On GPGPU Compute Engines

Jason Spencer

School of Computing and Digital Media, DePaul University, Chicago, IL, USA

arXiv:1111.1373v1 [cs.DC] (6 Nov 2011)

We examine the problem of optimizing classification tree evaluation for on-line and real-time applications by using GPUs. Focusing on trees with continuous attributes, which are often used in image segmentation, we first put the existing algorithms for serial and data-parallel evaluation on a solid footing. We then introduce a speculative parallel algorithm designed for the single instruction, multiple data (SIMD) architectures commonly found in GPUs. A theoretical analysis shows how the run times of the data and speculative decompositions compare, assuming independent processors. To compare the algorithms in the SIMD environment, we implement both on a CUDA 2.0 architecture machine and compare timings to a serial CPU implementation. Various optimizations and their effects are discussed, and results are given for all algorithms. In our specific tests, the speculative algorithm improves run time by 25% compared to the data decomposition.
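
The contrast between serial and speculative evaluation can be illustrated with a small CPU-side sketch. It is not the paper's implementation: the abstract does not give the tree layout or kernel details, so the heap-ordered complete binary tree, the arrays `FEATURE`, `THRESHOLD` and `LEAF_CLASS`, and both function names are assumptions made purely for illustration. The idea shown is that a speculative decomposition evaluates every node test up front (on a SIMD device, one comparison per lane) and only afterwards selects the path that was actually taken.

```python
# Hypothetical complete binary tree in heap order: internal node i has
# children 2*i + 1 (test false) and 2*i + 2 (test true).
FEATURE = [0, 1, 1]          # attribute index tested at each internal node
THRESHOLD = [0.5, 0.3, 0.7]  # continuous split threshold at each internal node
LEAF_CLASS = [0, 1, 2, 3]    # class label at each leaf, left to right


def classify_serial(x):
    """Walk from root to leaf, doing one comparison per level."""
    n_internal = len(FEATURE)
    i = 0
    while i < n_internal:
        i = 2 * i + 1 + int(x[FEATURE[i]] > THRESHOLD[i])
    return LEAF_CLASS[i - n_internal]


def classify_speculative(x):
    """Evaluate every node test first, then follow the recorded outcomes."""
    # On a SIMD machine each comparison could be assigned to its own lane;
    # the list comprehension only stands in for that parallel step.
    outcome = [x[f] > t for f, t in zip(FEATURE, THRESHOLD)]
    n_internal = len(outcome)
    i = 0
    while i < n_internal:
        i = 2 * i + 1 + int(outcome[i])  # no attribute comparison here
    return LEAF_CLASS[i - n_internal]


if __name__ == "__main__":
    for sample in [(0.2, 0.1), (0.2, 0.4), (0.6, 0.5), (0.6, 0.8)]:
        print(sample, classify_serial(sample), classify_speculative(sample))
```

Both functions return the same class for any input; the speculative version does more comparisons in total but removes data-dependent branching from the comparison step, which is the trade-off a SIMD architecture may reward.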

Tags: Computer science, Computer vision, CUDA, nVidia, nVidia Quadro FX 2000, Optimization, Pattern recognition

November 8, 2011 by hgpu
