high performance computing on graphics processing units: hgpu.org

Posts

Feb, 28

2014 3rd International Conference on Computer Technology and Science, ICCTS 2014

All papers for ICCTS 2014 will be published in the IJCEE (ISSN: 1793-8163) as one volume, and will be indexed by Ulrich’s Periodicals Directory, Google Scholar, EBSCO, Engineering & Technology Digital Library, Crossref, ProQuest, DOAJ, EI (INSPEC, IET) and Electronic Journals Library. 2014-04-05 Algorithms, Artificial Intelligence, Automated Software Engineering, Bio-informatics, Biomedical Engineering, Compilers […]

Feb, 28

Extending a Run-time Resource Management framework to support OpenCL and Heterogeneous Systems

From mobile to high-performance computing (HPC) systems, performance and energy efficiency are becoming increasingly challenging requirements. In this regard, heterogeneous systems, made up of a general-purpose processor and one or more hardware accelerators, are emerging as affordable solutions. However, the effective exploitation of such platforms requires specific programming languages, such as OpenCL, and suitable […]

OpenCL

Feb, 28

Expanding the VPE-qGM Environment Towards a Parallel Quantum Simulation of Quantum Processes Using GPUs

Quantum computing proposes quantum algorithms exponentially faster than their classical analogues when executed by a quantum computer. As quantum computers are currently unavailable for general use, one approach for analyzing the behavior and results of such algorithms is the simulation using classical computers. As this simulation is inefficient due to the exponential growth of the […]

CUDA

Feb, 28

A high performance computing for AOM stock trading order matching using GPU

Matching trading orders in financial markets is very challenging because of the speed at which requests arrive. In this paper, GPU technology and CUDA programming are explored as a potential means of accelerating this task. The trading method in Automatic Order Matching (AOM) of the Stock Exchange of Thailand (SET) […]

CUDA

Feb, 28

Performance Assessment of A Multi-block Incompressible Navier-Stokes Solver using Directive-based GPU Programming in a Cluster Environment

OpenACC, a directive-based GPU programming standard, is emerging as a promising technology for massively parallel accelerators, such as General-purpose computing on graphics processing units (GPGPU), Accelerated Processing Unit (APU) and Many Integrated Core Architecture (MIC). The heterogeneous nature of these accelerators calls for careful design of parallel algorithms and data management, which imposes a great hurdle […]

CUDA

Feb, 28

Heterogenous Acceleration for Linear Algebra in Multi-Coprocessor Environments

We present an efficient and scalable programming model for the development of linear algebra in heterogeneous multi-coprocessor environments. The model incorporates some of the current best design and implementation practices for the heterogeneous acceleration of dense linear algebra (DLA). Examples are given as the basis for solving linear systems’ algorithms – the LU, QR, and […]

Feb, 28

Scheduling data flow program in xkaapi: A new affinity based Algorithm for Heterogeneous Architectures

Efficient implementations of parallel applications on heterogeneous hybrid architectures require a careful balance between computations and communications with accelerator devices. Even if most of the communication time can be overlapped by computations, it is essential to reduce the total volume of communicated data. The literature therefore abounds with ad-hoc methods to reach that balance, but […]

CUDA

Feb, 27

2014 3rd International Conference on Knowledge Discovery, ICKD 2014

All papers of ICKD 2014 will be published in the International Journal of Computer Theory and Engineering (IJCTE)(ISSN: 1793-8201), and will be indexed by Electronic Journals Library, EBSCO, Engineering & Technology Digital Library, Google Scholar, INSPEC, Ulrich’s Periodicals Directory, Crossref, ProQuest, WorldCat, and EI (INSPEC, IET). 2014-04-05 T1. Novel Algorithms T2. Association Rules T3. Knowledge […]

Feb, 27

2014 3rd International Conference on Computing and Computer Vision, ICCCV 2014

All papers for the ICCCV 2014 will be published in the Journal of Image and Graphics (JOIG, ISSN: 2301-3699) as one volume, and will be indexed by Ulrich’s Periodicals Directory, Google Scholar, EBSCO, Engineering & Technology Digital Library and Electronic Journals Digital Library. 2014-04-01 Machine Vision, Image Processing, and Pattern Analysis Imaging Sensors Color and […]

Feb, 27

Parallel dual tree traversal on multi-core and many-core architectures for astrophysical N-body simulations

In astrophysical N-body simulations, Dehnen’s algorithm, implemented in the serial falcON code and based on a dual tree traversal, is faster than serial Barnes-Hut tree-codes, but outperformed by parallel CPU and GPU tree-codes. In this paper, we present a parallel dual tree traversal, implemented in the pfalcON code, targeting multi-core CPUs and manycore architectures (Xeon […]

CUDA

Feb, 27

Face Recognition Using OpenCL

Face recognition is the biometric identification of a human face and the matching of its image against a library of known faces. The algorithm used here is the Eigenfaces algorithm. The software proposed for the implementation is OpenCL. OpenCL (Open Computing Language) is an open standard for general purpose parallel programming […]

OpenCL

Feb, 27

G-Heart: A GPU-based System for Electrophysiological Simulation and Multi-modality Cardiac Visualization

Cardiac electrophysiological simulation and multi-modality visualization are computationally intensive and valuable in studying the structure, mechanism, and dynamics of the heart. The existing multi-CPU based approaches can reduce the calculation time, but suffer from hardware and communication cost problems and are inefficient for 3D data visualization. Compared with multi-CPU, the highly parallel and multi-core properties […]

CUDA

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

Engineering Supercomputing Platforms for Biomolecular Applications

Recent source codes

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?
