high performance computing on graphics processing units: hgpu.org

Posts

Aug, 13

Speech Recognition on Multi-Core Processors and GPUs

The speed of processors has remained stable over the past few years. The trend may even be towards slower speeds in order to satisfy the ever increasing demands of energy efficiency. This tendency is already apparent in the area of mobile devices. In order to take full advantage of the processing power offered by modern […]

CUDA

Aug, 13

Barrier Invariants: A Shared State Abstraction for the Analysis of Data-Dependent GPU Kernels

Data-dependent GPU kernels, whose data or control flow are dependent on the input of the program, are difficult to verify because they require reasoning about shared state manipulated by many parallel threads. Existing verification techniques for GPU kernels achieve soundness and scalability by using a two-thread reduction and making the contents of the shared state […]

CUDA • OpenCL

Aug, 13

Executing Process Networks on Heterogeneous Platforms using OpenCL

Upcoming heterogeneous systems ask for new programming paradigms. Abstracting the underlying hardware architecture is desirable in order to support productive software development. This thesis proposes a design flow and runtime-system for executing process networks on heterogeneous systems using OpenCL. Process networks are a popular model of computation for deterministic parallel programming and OpenCL is a […]

OpenCL

Aug, 13

Achieving Speedup in Aggregate Risk Analysis using Multiple GPUs

Stochastic simulation techniques employed for the analysis of portfolios of insurance/reinsurance risk, often referred to as 'Aggregate Risk Analysis', can benefit from exploiting state-of-the-art high-performance computing platforms. In this paper, parallel methods to speed up aggregate risk analysis for supporting real-time pricing are explored. An algorithm for analysing aggregate risk is proposed and implemented for multi-core […]

CUDA

Aug, 13

29th ACM Symposium on Applied Computing, SAC 2014

Special track at SAC 2014: Embedded Systems: New Perspectives for Hardware, System Software, and Applications. High performance embedded computing has recently become more and more present in devices used in everyday life. A wide variety of applications, from consumer electronics to biomedical systems, require building up powerful yet cheap embedded devices. In this context, embedded […]

Aug, 13

Faraday Discussions 169: Molecular simulations and visualization

Molecular simulations and visualization represent formidable communication and learning tools that could significantly speed up understanding of complex biomolecular systems. Interactive and immersive scientific computing includes aspects from the natural sciences, such as biology, chemistry and material sciences, and from the computational sciences, such as HPC architectures, graphical rendering, human-computer interaction, virtual reality and game […]

Aug, 13

GPU computing in MATLAB webinar (in Russian)

Softline and MathWorks invite you to participate in a free webinar, “GPU Computing in MATLAB”, which will be held on August 29 at 10:00 Moscow time. In this webinar, you will learn how to use NVIDIA CUDA GPUs in your MATLAB calculations with minimal code changes. Opportunities to work with MATLAB GPU is […]

Aug, 12

GPU Computing in Discrete Optimization. Part II: Survey Focused on Routing Problems

In many cases there is still a large gap between the performance of current optimization technology and the requirements of real world applications. As in the past, performance will improve through a combination of more powerful solution methods and a general performance increase of computers. These factors are not independent. Due to physical limits, hardware […]

CUDA

Aug, 12

GPU Computing in Discrete Optimization. Part I: Introduction to the GPU

CUDA

Aug, 12

Parallel Simulations for Analysing Portfolios of Catastrophic Event Risk

At the heart of the analytical pipeline of a modern quantitative insurance/reinsurance company is a stochastic simulation technique for portfolio risk analysis and pricing process referred to as Aggregate Analysis. Support for the computation of risk measures including Probable Maximum Loss (PML) and the Tail Value at Risk (TVAR) for a variety of types of […]

CUDA

Aug, 12

Molecular Dynamics Simulation of Multi-Scale Flows on GPUs

The aim of the project described in this report was to integrate the Molecular Fluid Dynamics simulation package OpenFOAM with the molecular-modelling library OpenMM, in order to enable the execution of Molecular Fluid Dynamics simulations on Graphics Processing Units (GPUs). The reason for this integration, rather than a straightforward substitution of OpenFOAM with OpenMM, was […]

CUDA • OpenCL

Aug, 12

A Memory Efficient Algorithm for Adaptive Multidimensional Integration with Multiple GPUs

We present a memory efficient algorithm and its implementation for solving multidimensional numerical integration on a cluster of compute nodes with multiple GPU devices per node. The effective use of shared memory is important for improving the performance on GPUs, because of the bandwidth limitation of the global memory. The best known sequential algorithm for […]

CUDA

