high performance computing on graphics processing units: hgpu.org

A load balance multi-scheduling model for OpenCL kernel tasks in an integrated cluster

Usman Ahmed, Jerry Chun-Wei Lin, Gautam Srivastava, Muhammad Aleem

Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway

Soft Computing, 2020

DOI:10.1007/s00500-020-05152-8

Modern embedded systems comprise heterogeneous multi-core architectures, i.e., CPUs and GPUs. These architectures offer substantial performance benefits when an application is mapped to the appropriate processing core. Typically, programmers map sequential applications to the CPU and parallel applications to the GPU. Task mapping becomes challenging as CPU- and GPU-based architectures evolve and grow more complex. This paper presents an approach for mapping OpenCL applications to heterogeneous multi-core architectures by determining application suitability and processing capability. The classification is performed by a machine learning-based device suitability classifier that predicts which processor has the highest computational compatibility for running a given OpenCL application. The paper proposes 20 distinct features, which are extracted with an LLVM-based static analyzer developed for this purpose. To select the best subset of features, feature selection is performed using both correlation analysis and a feature importance method. To address the class imbalance problem, we use the synthetic minority over-sampling method and compare its results with and without feature selection. Instead of hand-tuning the machine learning classifier, we use the tree-based pipeline optimization method to select the best classifier and its hyper-parameters. We then compare the selected, optimized method with traditional algorithms, i.e., random forest, decision tree, Naïve Bayes and KNN. We apply our approach to widely used OpenCL benchmark suites, i.e., AMD and Polybench. The dataset contains 653 training and 277 testing applications. We evaluate the classification results using four performance metrics, i.e., F-measure, precision, recall and R^2. The optimized model with the reduced feature subset achieved an F-measure of 0.91 and an R^2 of 0.76. The proposed framework automatically distributes the workload based on the application requirements and processor compatibility.
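Three of the steps named in the abstract (correlation-based feature filtering, synthetic minority over-sampling, and the precision/recall/F-measure evaluation) can be illustrated with a minimal, standard-library-only Python sketch. This is not the authors' implementation: the function names, the 0.95 correlation threshold, the neighbour count and the toy feature values are all assumptions made for illustration. The over-sampling function follows the general interpolation idea behind SMOTE rather than any particular library's version.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def drop_correlated(rows, threshold=0.95):
    """Return indices of feature columns kept after a greedy correlation filter.

    A column is dropped when its absolute correlation with any already-kept
    column reaches the (assumed) threshold.
    """
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical static-analysis features per kernel; column 1 duplicates column 0.
toy_rows = [[1, 2, 5], [2, 4, 3], [3, 6, 8], [4, 8, 1]]
kept_columns = drop_correlated(toy_rows)

# Hypothetical minority-class samples (e.g. kernels labelled for the rarer device).
toy_minority = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0]]
extra_samples = smote_like(toy_minority, n_new=4)

# Hypothetical device labels: 1 = GPU, 0 = CPU.
scores = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In practice the filtered and re-balanced training set would be passed to a classifier search such as the tree-based pipeline optimization mentioned above. That step is omitted here because it depends on third-party libraries.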

Tags: Computer science, Heterogeneous systems, Machine learning, nVidia, nVidia GeForce GT 740 M, nVidia GeForce GTX 760, OpenCL

July 19, 2020 by hgpu
