Posts

Dec, 19

Optimization of mapped functions sequences using fusions on GPU

When implementing a function mapping on a contemporary GPU, several contradictory performance factors have to be balanced. A decomposition-fusion scheme was previously devised to guide such an implementation, and this work elaborates on it further. To ease this process, an automatic source-to-source compiler is presented, while the main subject of this thesis is the core […]
Dec, 19

GPU-based Implementation of the Variational Path Integral Method

Any system in the world consists of particles such as electrons. To analyze the behavior of such a system, the behavior of its particles must be predicted. The ground state energy of a molecule is the most important information about it and can be calculated by solving the Schrödinger equation. But as the number of atoms increases, the […]
Dec, 19

Towards Automatic C Programs Optimization and Parallelization using the PIPS-PoCC Integration

This paper explains how the PIPS source-to-source compilation framework integrates the Polyhedral Compiler Collection (PoCC) as one of PIPS's many program transformations. The integration between PIPS and PoCC automatically extracts the static control parts of the source code, which can be optimized independently by PoCC, and then transparently reintegrates them into the user source code. […]
Dec, 19

Experiences Developing the OpenUH Compiler and Runtime Infrastructure

The OpenUH compiler is a branch of the open source Open64 compiler suite for C, C++, Fortran 95/2003, with support for a variety of targets including x86_64, IA-64, and IA-32. For the past several years, we have used OpenUH to conduct research in parallel programming models and their implementation, static and dynamic analysis of parallel […]
Dec, 19

Acceleration of grammatical evolution using graphics processing units: computational intelligence on consumer games and graphics hardware

Several papers show that symbolic regression is suitable for data analysis and prediction in financial markets. Grammatical Evolution (GE), a grammar-based form of Genetic Programming (GP), has been successfully applied in solving various tasks including symbolic regression. However, often the computational effort to calculate the fitness of a solution in GP can limit the area […]
Dec, 19

Parallel programming with inductive synthesis

We show that program synthesis can generate GPU algorithms as well as their optimized implementations. Using the scan kernel as a case study, we describe our evolving synthesis techniques. Relying on our synthesizer, we can parallelize a serial problem by transforming it into a scan operation, synthesize a SIMD scan algorithm, and optimize it to […]
Dec, 18

A Translation Framework for Executing the Sequential Binary Code on CPU/GPU Based Architectures

The method of using DBT (dynamic binary translation) to execute the source ISA's binary code on target platforms has been hampered by high overhead for many years. The GPU, as a many-core processor, has tremendous computational power. Employing the GPU as a coprocessor to execute the hot spots of binary code in parallel holds a great promise of […]
Dec, 18

Automatic code generation and tuning for stencil kernels on modern shared memory architectures

In this paper, we present Patus, a code generation and auto-tuning framework for stencil computations targeted at multi- and manycore processors, such as multicore CPUs and graphics processing units. Patus, which stands for "Parallel Autotuned Stencils," generates a compute kernel from a specification of the stencil operation and a strategy which describes the parallelization and […]
Dec, 18

GPU-based simulation of 3D blood flow in abdominal aorta using OpenFOAM

The simulation of blood flow in the cardiac system has the potential to become an attractive diagnostic tool for many cardiovascular diseases, such as in the case of aneurysm. This potential could be reached if the simulations were to be completed in hours rather than days and without resorting to the use of expensive supercomputers. […]
Dec, 18

Collision for 75-step SHA-1: Intensive Parallelization with GPU

We present a brief report on the collision search for reduced SHA-1. With a few improvements to our previous work, directed at efficient parallelization on a GPU cluster, we managed to construct a new collision for the 75-step reduced SHA-1 hash function.
Dec, 18

Leveraging Parallelism with CUDA and OpenCL

Graphics processing units (GPUs), originally designed for computing and manipulating pixels, have become general-purpose processors capable of executing in excess of a trillion calculations per second. Taking advantage of the GPU’s compute power and commodity popularity, the field of computing systems is exhibiting a trend toward heterogeneous platforms consisting of a central processor integrated with graphics hardware. […]
Dec, 18

Efficient Computational Methods for Uncertainty Quantification of Large Systems

The quest to design environment-friendly and sustainable engineering systems has witnessed more and more fervent efforts in recent years. With the growth of affordable large-capacity computing resources, predictive, science-based computational models have become instrumental in this pursuit. The present work develops efficient computational methods for the uncertainty analysis of large dynamical and mechanical systems with […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org