Posts

Jun, 28

Introducing Parallelism to the Ranges TS

The current interface provided by the C++17 parallel algorithms poses some limitations with respect to parallel data access and heterogeneous systems, such as personal computers and server nodes with GPUs, smartphones, and embedded System on a Chip chipsets. In this paper, we present a summary of why we believe the Ranges TS solves these problems, […]
Jun, 28

Computing dynamics of thin films via large scale GPU-based simulations

We present the results of large scale simulations of 4th order nonlinear partial differential equations of diffusion type that are typically encountered when modeling dynamics of thin fluid films on substrates. The simulations are based on the alternate direction implicit (ADI) method, with the main part of the computational work carried out in […]
Jun, 28

Analyzing Memory Accesses for Performance and Correctness of Parallel Programs

The demand for large compute capabilities in scientific computing led to wide use and acceptance of highly-parallel computer architectures during the last decade. This trend is manifested in the TOP500, which lists the fastest supercomputers in the world and in which about 40% of the performance share results from accelerator-based systems. Programming for these architectures in […]
Jun, 28

Improving tasks throughput on accelerators using OpenCL command concurrency

A heterogeneous architecture composed of a host and an accelerator must frequently deal with situations where several independent tasks are available to be offloaded onto the accelerator. These tasks can be generated by concurrent applications executing on the host or, in case the host is a node of a computer cluster, by applications running on […]
Jun, 28

Migrating from OpenGL ES to Vulkan

This document outlines the key differences between OpenGL ES and the new Vulkan, and why a developer would want to migrate to Vulkan. Vulkan is a new low-level graphics API that lets the developer work very close to the hardware through an almost console-like API. This allows for greater control, performance and transparency. This is […]
Jun, 24

Research on the simulation of PF-LBM model based on MPI+CUDA mixed granularity parallel

A microstructure numerical model is an intensive computational problem, for which the simulation time is too long and the simulation scale is too small. To solve these two problems, in this article, we use MPI+CUDA hybrid particle heterogeneous parallel computing to implement the dendrite growth simulation of a PF-LBM phase-field 3D model. Message Passing Interface […]
Jun, 24

Go game move prediction using convolutional neural network

The purpose of this paper is to introduce the use of a convolutional neural network for predicting the next appropriate move in the game of Go. The paper contains a description of the crucial Go rules, neural network theory, a description of the implemented programs, and a final evaluation of the trained neural networks. The programs were implemented with […]
Jun, 24

Strategies for the Heterogeneous Execution of Large-Scale Simulations on Hybrid Supercomputers

Massively-parallel devices of various architectures are being adopted by the newest supercomputers to overcome the actual power constraint in the context of the exascale challenge. This progress leads to an increasing hybridisation of HPC systems and makes the design of computing applications a rather complex problem. Therefore, the software efficiency and portability are of crucial […]
Jun, 24

Synthesis of GPU Programs from High-Level Models

Modern graphics processing units (GPUs) provide high-performance general purpose computation abilities. They have massive parallel architectures that are suitable for executing parallel algorithms and operations. They are also throughput-oriented devices that are optimized to achieve high throughput for stream processing. Designing efficient GPU programs is a notoriously difficult task. The ForSyDe methodology is suitable to […]
Jun, 24

DeepSmith: Compiler Fuzzing through Deep Learning

Random program generation – fuzzing – is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested. We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative […]
Jun, 22

The 3rd International Conference on Computer Systems and Communication Technology (ICCSCT), 2018

The conference will bring together researchers to exchange their research results and address open issues in: Computer Systems and Engineering; Artificial Intelligence; Computer Modeling; Internet of Things (IoT); Decision Support System and Models; Software Engineering; Computer Graphics and Multimedia; Image Processing; System software support for multimedia; Communication Technology; Network and Wireless Communication; VLSI Circuits and […]
Jun, 22

The 4th International Conference on Communication and Information Processing (ICCIP), 2018

Keynote speakers: Prof. Jalel Ben-Othman – University of Paris 13, France. Prof. Herwig Unger – University of Hagen, Germany. Published by: Accepted papers of ICCIP 2018 will be published in the ICCIP 2018 Conference Proceedings, which will appear in the International Conference Proceedings Series by ACM and be archived in the ACM Digital Library. The […]

* * *

HGPU group © 2010-2019 hgpu.org

All rights belong to the respective authors
