Posts

Feb, 3

A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques

Developing efficient software and hardware has never been harder, whether it is for a tiny IoT device or an Exascale supercomputer. Apart from the ever-growing design and optimization complexity, there exist even more fundamental problems such as a lack of interdisciplinary knowledge required for effective software/hardware co-design, and a growing technology transfer gap between academia […]

Feb, 3

Efficient SIMD Vectorization for Hashing in OpenCL

Hashing is at the core of many efficient database operators such as hash-based joins and aggregations. Vectorization is a technique that uses Single Instruction Multiple Data (SIMD) instructions to process multiple data elements at once. Applying vectorization to hash tables results in promising speedups for build and probe operations. However, vectorization typically requires intrinsics – […]

Feb, 3

Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning

The Deep Learning (DL) community sees many novel topologies published each year. Achieving high performance on each new topology remains challenging, as each requires some level of manual effort. This issue is compounded by the proliferation of frameworks and hardware platforms. The current approach, which we call "direct optimization", requires deep changes within each framework […]

Jan, 30

The 4th International Conference on Control, Automation and Robotics (ICCAR), 2018

ICCAR 2018 is a not-to-be-missed opportunity that distills the most current knowledge on a rapidly advancing discipline in one conference. Join key researchers and established professionals in the field of control, automation and robotics as they assess the current state-of-the-art and road-map crucial areas for future research. It will provide a valuable opportunity for researchers, […]

Jan, 30

6th International Conference on Sustainable Development (ICSD), 2018

The International Conference on Sustainable Development is organized by the European Center of Sustainable Development in collaboration with CIT University. The 6th ICSD 2018 is inspired by the critical challenge of human, environmental, and economic sustainability concerning the present and future generations in a global-scale context. The Conference venue is: Roma Eventi, Congress Center, Piazza […]

Jan, 30

3rd International Conference on Computer and Communication Systems (ICCCS), 2018

Published by: Accepted papers will be published in the conference proceedings, which will be submitted for inclusion in IEEE Xplore and for indexing in Ei Compendex and Scopus. The conference proceedings of ICCCS 2015 can be checked in IEEE Xplore. Papers of ICCCS 2015 have been indexed by Ei Compendex and Scopus. The conference proceeding […]

Jan, 28

SkePU 2: Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems

In this article we present SkePU 2, the next generation of the SkePU C++ skeleton programming framework for heterogeneous parallel systems. We critically examine the design and limitations of the SkePU 1 programming interface. We present a new, flexible and type-safe, interface for skeleton programming in SkePU 2, and a source-to-source transformation tool which knows […]

Jan, 28

Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on Next-Generation Architectures

Sparse-Matrix Vector products (SpMV) are highly irregular computational kernels that can be found in a diverse collection of high-performance science applications. Performance for this important kernel is often highly correlated with the associated matrix sparsity, which, in turn, governs the computational granularity, and therefore, the efficiency of the memory system. In this paper, we propose […]

Jan, 28

Communication Architectures for Scalable GPU-centric Computing Systems

In recent years, power consumption has become the main concern in High Performance Computing (HPC). This has led to heterogeneous computing systems in which Central Processing Units (CPUs) are supported by accelerators, such as Graphics Processing Units (GPUs). While GPUs used to be seen as slave devices to which the main processor offloads computation, today’s […]

Jan, 28

A Survey on Compiler Autotuning using Machine Learning

Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the […]

Jan, 28

Improving Communication Performance in GPU-Accelerated HPC Clusters

In recent years, GPUs have been adopted in many High-Performance Computing (HPC) clusters due to their massive computational power and energy efficiency. The Message Passing Interface (MPI) is the de facto standard for parallel programming. Many HPC applications, written in MPI, use parallel processes and multiple GPUs to achieve higher performance and GPU memory capacity. In […]

Jan, 20

Scheduling Parallel Tasks under Multiple Resources: List Scheduling vs. Pack Scheduling

Scheduling in High-Performance Computing (HPC) has traditionally been centered around computing resources (e.g., processors/cores). The ever-growing amount of data produced by modern scientific applications starts to drive novel architectures and new computing frameworks to support more efficient data processing, transfer and storage for future HPC systems. This trend towards data-driven computing demands the scheduling solutions […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
