high performance computing on graphics processing units: hgpu.org

Posts

Apr, 4

An efficient GPU acceptance-rejection algorithm for the selection of the next reaction to occur for Stochastic Simulation Algorithms

MOTIVATION: The Stochastic Simulation Algorithm (SSA) has become widespread in the field of systems biology. This approach needs many realizations to establish statistical results on the system under study. It is very computationally demanding, and with the advent of large models this burden is increasing. Hence parallel implementations of SSA are needed to address these […]

CUDA
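The abstract is truncated before it describes the algorithm, so the following is only a generic sketch of acceptance-rejection selection of the next reaction, not the authors' GPU method. It assumes a list of non-negative propensities; the function name `select_reaction` and the `rng` parameter are hypothetical. A candidate reaction is drawn uniformly and accepted with probability a_j / a_max, which selects reaction j with probability proportional to its propensity.

```python
import random


def select_reaction(propensities, rng=random):
    """Return index j with probability a_j / sum(a), via acceptance-rejection.

    Generic textbook sketch (assumption): not the algorithm from the paper.
    """
    a_max = max(propensities)
    if a_max <= 0:
        raise ValueError("no reaction can fire: all propensities are zero")
    n = len(propensities)
    while True:
        j = rng.randrange(n)  # uniform candidate reaction
        # Accept candidate j with probability a_j / a_max.
        if rng.random() * a_max < propensities[j]:
            return j


if __name__ == "__main__":
    rng = random.Random(42)
    counts = [0, 0, 0]
    for _ in range(10000):
        counts[select_reaction([1.0, 0.0, 3.0], rng)] += 1
    print(counts)  # the middle reaction (zero propensity) is never chosen
```

Each trial needs no prefix sum over the propensities, which is one reason rejection schemes are attractive when many independent realizations run in parallel; the price is a variable number of trials when the propensities are very uneven.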

Apr, 4

Irradiation Instability at the Inner Edges of Accretion Disks

An instability can potentially operate in highly irradiated disks where the disk sharply transitions from being radially transparent to opaque (the ‘transition region’). Such conditions may exist at the inner edges of transitional disks around T Tauri stars and accretion disks around AGNs. We derive the criterion for this instability, which we term the ‘irradiation […]

CUDA

Apr, 4

Anomalous metastability in a temperature-driven transition

Langer theory of metastability provides a description of the lifetime and properties of the metastable phase of the Ising model field-driven transition, describing the magnetic field-driven transition in ferromagnets and the chemical potential-driven transition of fluids. An immediate further step is to apply it to the study of a transition driven by the temperature, as […]

CUDA

Apr, 4

Petascale computations for Large-scale Atomic and Molecular collisions

Petaflop architectures are currently being utilized efficiently to perform large scale computations in Atomic, Molecular and Optical Collisions. We solve the Schroedinger or Dirac equation for the appropriate collision problem using the R-matrix or R-matrix with pseudo-states approach. We briefly outline the parallel methodology used and implemented for the current suite of Breit-Pauli and DARC […]

CUDA

Apr, 4

2014 5th International Conference on Computer and Computational Intelligence, ICCCI 2014

2014-09-01 Selected submitted papers will be recommended for publication in one of the journals below: Journal of Computers (JCP, ISSN: 1796-203X) – EI Compendex, SCOPUS; Journal of Software (JSW, ISSN: 1796-217X) – EI Compendex, SCOPUS; International Journal of Machine Learning and Computing (IJMLC, ISSN: 2010-3700; DOI: 10.7763/IJMLC) – Engineering & Technology Digital Library, Google Scholar, Crossref, ProQuest, Electronic […]

Apr, 3

Experiments with Massively Parallel Matrix Multiplication

This paper presents initial experiments in implementing two notable matrix multiplication algorithms – the DNS algorithm and Cannon’s algorithm – using NVIDIA’s general-purpose graphics processing units (GPGPUs) and CUDA development platform. We demonstrate that these implementations are comparable with traditional methods in terms of computational expense and may scale better than traditional techniques.

CUDA
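As a rough illustration of Cannon's algorithm named in the abstract, here is a sequential sketch that simulates an n x n grid of processors, each holding a single matrix element. This is an assumption-laden toy, not the paper's CUDA implementation: the function name `cannon_matmul` is hypothetical, and a real implementation would hold a block per processor and perform the shifts as communication steps.

```python
def cannon_matmul(A, B):
    """Multiply two n x n matrices by simulating Cannon's shift pattern."""
    n = len(A)
    # Initial skew: row i of A is rotated left by i, column j of B up by j.
    a = [[A[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    C = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Every "processor" (i, j) multiplies the two elements it holds.
        for i in range(n):
            for j in range(n):
                C[i][j] += a[i][j] * b[i][j]
        # Rotate every row of A left by one and every column of B up by one.
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return C


if __name__ == "__main__":
    print(cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

After the skew, processor (i, j) holds A[i][k] and B[k][j] for the same k at every step, so n multiply-and-shift rounds accumulate the full dot product.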

Apr, 3

Multicore and GPU Algorithms for Nussinov RNA Folding

We develop cache efficient, multicore, and GPU algorithms for RNA folding using Nussinov’s equations. Our cache efficient algorithm provides a speedup between 1.6 and 3.0 relative to a naive straightforward single core code. The multicore version of the cache efficient single core algorithm provides a speedup, relative to the naive single core algorithm, between 7.5 […]

CUDA
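For orientation, the naive single-core baseline that the abstract compares against might look like the sketch below, which fills the Nussinov table in order of increasing subsequence length. The pairing rules (Watson-Crick plus G-U wobble), the `min_loop` parameter, and the function name `nussinov` are assumptions for illustration; the paper's cache-efficient, multicore, and GPU variants are not reproduced here.

```python
def nussinov(seq, min_loop=0):
    """Return the maximum number of nested base pairs in an RNA sequence."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
             ("G", "U"), ("U", "G")}  # assumed pairing rules
    n = len(seq)
    if n == 0:
        return 0
    N = [[0] * n for _ in range(n)]  # N[i][j]: best score for seq[i..j]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(N[i + 1][j], N[i][j - 1])  # leave i or j unpaired
            if span > min_loop and (seq[i], seq[j]) in pairs:
                best = max(best, N[i + 1][j - 1] + 1)  # pair i with j
            for k in range(i + 1, j):  # split into two independent parts
                best = max(best, N[i][k] + N[k + 1][j])
            N[i][j] = best
    return N[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAAUCC"))
```

The inner loop over `k` makes this O(n^3) in time, and its strided accesses to `N[k + 1][j]` are the kind of access pattern that cache-efficient reformulations typically target.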

Apr, 2

Pairwise Sequence Alignment for Very Long Sequences on GPUs

We develop novel single-GPU parallelizations of the Smith-Waterman algorithm for pairwise sequence alignment. Our algorithms, which are suitable for the alignment of a single pair of very long sequences, can be used to determine the alignment score as well as the actual alignment. Experimental results demonstrate an order of magnitude reduction in run time relative […]

CUDA
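A minimal sequential sketch of the Smith-Waterman score computation is shown below, assuming a linear gap penalty and illustrative scoring values (`match=2`, `mismatch=-1`, `gap=-1`); these parameters and the function name are hypothetical, and the paper's single-GPU parallelizations are not shown. Only two rows of the table are kept, which is what makes score-only computation feasible for very long sequences.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of a and b (linear gaps)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTYY", "ZZACGTWW"))
```

Cells on the same anti-diagonal depend only on earlier anti-diagonals, which is the usual source of parallelism in GPU versions; recovering the actual alignment additionally requires traceback information.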

Apr, 2

Acceleration of Diagrammatic Determinantal Quantum Monte Carlo Calculations using GPUs

Diagrammatic Determinantal Quantum Monte Carlo (DDQMC) algorithms are used to solve quantum impurity models such as the Anderson model. The calculation of acceptance rates and observables during the Monte Carlo walk involves linear algebra operations whose computational expense increases with decreasing temperature. Thus, the lower boundary of the treatable temperature range is limited by the […]

CUDA

Apr, 2

A Novel Open Source Morphology Using GPU Processing With LTU-CUDA

Mathematical morphology is used as a tool for extracting image components that are useful in the representation and description of region shape. The mathematical morphology operations of dilation, erosion, opening, and closing are important building blocks of many other image processing algorithms. Data-parallel programming provides an opportunity for performance acceleration using highly […]

CUDA
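The four operations named in the abstract can be sketched for binary images as follows. This is a plain sequential illustration, not LTU-CUDA code: the 3x3 square structuring element, the treatment of pixels outside the image as background, and all function names are assumptions.

```python
# Offsets of an assumed 3x3 square structuring element.
SQUARE_3X3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _at(img, y, x):
    """Pixel value, with everything outside the image treated as 0."""
    h, w = len(img), len(img[0])
    return img[y][x] if 0 <= y < h and 0 <= x < w else 0


def erode(img, offsets=SQUARE_3X3):
    """A pixel stays 1 only if every neighbour under the element is 1."""
    return [[int(all(_at(img, y + dy, x + dx) for dy, dx in offsets))
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img, offsets=SQUARE_3X3):
    """A pixel becomes 1 if any neighbour under the element is 1."""
    return [[int(any(_at(img, y + dy, x + dx) for dy, dx in offsets))
             for x in range(len(img[0]))] for y in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes small bright specks."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills small dark holes."""
    return erode(dilate(img))


if __name__ == "__main__":
    dot = [[0] * 5 for _ in range(5)]
    dot[2][2] = 1
    for row in dilate(dot):
        print(row)
```

Each output pixel depends only on a small fixed neighbourhood of the input, so every pixel can be computed independently, which is what makes these operations a natural fit for data-parallel hardware.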

Apr, 2

2014 5th International Conference on Networking and Information Technology, ICNIT 2014

2014-09-20 Accepted papers will be published in one of the following journals with ISSN: International Journal of Computer and Communication Engineering (IJCCE, ISSN: 2010-3743); Journal of Advances in Computer Networks (JACN, ISSN: 1793-8244); Journal of Communications (ISSN: 1796-2021). Topics: 3G & 4G Mobile Communication Services; Agents and Multi-Agents systems for ICT; Integrated Circuits for Communications […]

Apr, 1

GPU Based Performance Acceleration of Radar Imaging Algorithms

We consider the performance acceleration of the conventional Time Domain Backprojection and Kirchhoff Migration algorithms for imaging concealed targets. The Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL) are used here for accelerating these algorithms on Graphics Processing Units (GPUs). Data generated by means of analytical methods, simulation and experiment are used for […]

CUDA • OpenCL
