high performance computing on graphics processing units: hgpu.org

Posts

Nov, 25

Ageing at the Spin-Glass/Ferromagnet Transition: Monte Carlo Simulation using GPUs

We study the non-equilibrium ageing behaviour of the +/-J Edwards-Anderson model in three dimensions for samples of size up to N=128^3 and for up to 10^8 Monte Carlo sweeps. In particular, we are interested in the change of the ageing when crossing from the spin-glass phase to the ferromagnetic phase. The necessary long simulation […]

CUDA

Nov, 25

Scalable Verification Techniques for Data-Parallel Programs

This thesis is about scalable formal verification techniques for software. A verification technique is scalable if it is able to scale to reasoning about real (rather than synthetic or toy) programs. Scalable verification techniques are essential for practical program verifiers. In this work, we consider three key characteristics of scalability: precision, performance and automation. We […]

CUDA • OpenCL

Nov, 21

4th International Conference on Computer Technology and Science, ICCTS 2015

Publication: Selected submitted papers will be recommended for publication in one of the journals below: *IJCTE: Abstracting/Indexing: Index Copernicus, Electronic Journals Library, EBSCO, Engineering & Technology Digital Library, Google Scholar, Ulrich’s Periodicals Directory, Crossref, ProQuest, WorldCat, and EI (INSPEC, IET), Cabell’s Directories. *IJCCE: Abstracting/Indexing: EI (INSPEC, IET), Google Scholar, Engineering & Technology Digital Library, ProQuest, and […]

Nov, 20

A GPU-based framework for efficient image processing

This thesis tries to answer how to design a framework for image processing on the GPU, supporting the common environments OpenGL GLSL, OpenCL and CUDA. A generalized view of GPU image processing is presented. The framework is called gpuip and is implemented in C++ but also wrapped with Python bindings. The framework is cross-platform and works […]

CUDA • OpenCL • OpenGL

Nov, 20

Using CUDA architecture for computer simulations of thermomechanical phenomena

This paper presents a simulation of the casting solidification process performed on graphics processors compatible with the nVidia CUDA architecture. For the parallel implementation of a computer simulation of the solidification process, it was necessary to modify the numerical model. The new approach shown in this paper allows the process of matrix building to be […]

CUDA

Nov, 20

Automatic Performance Tuning of Pipeline Patterns for Heterogeneous Parallel Architectures

Heterogeneous parallel architectures combining conventional multicore CPUs with GPUs and other types of accelerators promise significant performance gains compared to homogeneous systems. However, exploiting the full potential of such systems is becoming more and more challenging, often forcing programmers to combine different programming models and parallelization strategies. A promising approach to coping with the increased […]

OpenCL

Nov, 20

CL2QCD – Lattice QCD based on OpenCL

We present the Lattice QCD application CL2QCD, which is based on OpenCL and can run on Graphics Processing Units as well as on common CPUs. We focus on implementation details as well as performance results of selected features. CL2QCD has been successfully applied in LQCD studies at finite temperature and density and […]

OpenCL

Nov, 20

Using Graphics Processing Units to solve the classical N-body problem in physics and astrophysics

Graphics Processing Units (GPUs) can speed up the numerical solution of various problems in astrophysics, including the dynamical evolution of stellar systems; the performance gain can be more than a factor of 100 compared to using a Central Processing Unit only. In this work I describe some strategies to speed up the classical N-body problem using […]

CUDA • OpenCL

Nov, 20

International Conference on Engineering Mathematics and Physics, ICEMP 2015

Publication: Submitted papers can be selected and published into one of the following Journals: Advanced Materials Research (ISSN: 1022-6680) Indexed by Elsevier: SCOPUS and Ei Compendex (CPX), Cambridge Scientific Abstracts (CSA), Chemical Abstracts (CA), Google and Google Scholar, ISI (ISTP, CPCI, Web of Science), Institution of Electrical Engineers (IEE), etc. International Journal of Applied Physics […]

Nov, 20

OPNET: An Integrated Design Paradigm for Simulations

In recent years, a lot of progress has been made in the field of networks and communications, and also in the design of simulators. In this paper, we survey and review prominent fields where OPNET has been applied and compare it with other existing simulators. Our work helps beginners and researchers alike in estimating the useful […]

Nov, 20

A Study of Successive Over-relaxation Method Parallelization Over Modern HPC Languages

Successive over-relaxation (SOR) is a computationally intensive, yet extremely important iterative solver for linear systems. Due to recent trends of exponential growth in the amount of data generated and increasing problem sizes, serial platforms have proved to be insufficient in providing the required computational power. In this paper, we present parallel implementations of red-black […]

Nov, 19

FPGA: An Efficient And Promising Platform For Real-Time Image Processing Applications

Digital image processing (DIP) is an ever-growing area with a variety of applications including medicine, video surveillance, and many more. To implement the upcoming sophisticated DIP algorithms and to process the large amount of data captured from sources such as satellites or medical instruments, intelligent high-speed real-time systems have become imperative. Image processing algorithms […]

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler
