
Posts

Jun 15

Energy Efficiency Analysis of GPUs

In the last few years, Graphics Processing Units (GPUs) have become a great tool for massively parallel computing. GPUs are designed specifically for throughput and face several design challenges, especially the well-known Power and Memory Walls. In these devices, available resources should be used to enhance performance and throughput, as the performance […]
Jun 14

SAGA: SystemC Acceleration on GPU Architectures

SystemC is a widespread language for HW/SW system simulation and design exploration, and thus a key development platform in embedded system design. However, the growing complexity of SoC designs is having an impact on simulation performance, leading to limited SoC exploration potential, which in turn affects development and verification schedules and time-to-market for new designs. […]
Jun 14

Performance Gains in Conjugate Gradient Computation with Linearly Connected GPU Multiprocessors

Conjugate gradient is an important iterative method used for solving least squares problems. It is compute-bound and generally involves only simple matrix computations. One would expect that we could fully parallelize such computation on the GPU architecture with multiple Stream Multiprocessors (SMs), each consisting of many SIMD processing units. While implementing a conjugate gradient method […]
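As a reminder of the method this abstract refers to, here is a minimal serial conjugate gradient sketch in Python with NumPy (function name, tolerance, and iteration cap are illustrative, not from the paper; the paper's GPU implementation is more involved):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x            # residual
    p = r.copy()             # search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)   # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p  # new A-conjugate direction
        rs_old = rs_new
    return x
```

Each iteration is dominated by one matrix-vector product and a few dot products and AXPYs, which is why the abstract frames the question as how well these simple kernels map onto multiple SMs.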
Jun 14

Exploiting Unexploited Computing Resources for Computational Logics

We present an investigation of the use of GPGPU techniques to parallelize the execution of a satisfiability solver, based on the traditional DPLL procedure – which, in spite of its simplicity, still represents the core of the most competitive solvers. The investigation tackles some interesting problems, including the use of a predominantly data-parallel architecture, like […]
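For readers unfamiliar with the DPLL procedure the abstract builds on, a minimal recursive sketch in Python follows (clause encoding and helper names are illustrative; the paper's data-parallel formulation differs substantially):

```python
def dpll(clauses, assignment=None):
    """Minimal DPLL SAT solver.

    Clauses are lists of nonzero ints: positive = variable, negative = negation.
    Returns a satisfying assignment dict, or None if unsatisfiable.
    """
    if assignment is None:
        assignment = {}
    # Simplify the formula under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None  # clause falsified: conflict
        simplified.append(rest)
    if not simplified:
        return assignment  # every clause satisfied
    # Unit propagation: a one-literal clause forces its variable.
    for clause in simplified:
        if len(clause) == 1:
            l = clause[0]
            return dpll(clauses, {**assignment, abs(l): l > 0})
    # Branch on the first unassigned variable.
    v = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(clauses, {**assignment, v: value})
        if result is not None:
            return result
    return None
```

The branching and unit-propagation steps above are inherently sequential, which is exactly the tension with a data-parallel architecture that the abstract describes.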
Jun 14

Parakeet: A Just-In-Time Parallel Accelerator for Python

High-level productivity languages such as Python or Matlab make computational resources accessible to non-expert programmers. However, these languages often sacrifice program speed for ease of use. This paper proposes Parakeet, a library which provides a just-in-time (JIT) parallel accelerator for Python. Parakeet bridges the gap between the usability of Python and the […]
Jun 13

Using Fermi architecture knowledge to speed up CUDA and OpenCL programs

NVIDIA graphics processing units (GPUs) play an important role as general-purpose programming devices. Implementing parallel codes that exploit the GPU hardware architecture is a task for experienced programmers. The choice of threadblock size and shape is one of the most important user decisions when a parallel problem is coded. The threadblock […]
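To make the threadblock-size decision concrete, here is a small sketch of the arithmetic involved (plain Python for illustration; function names are ours, but the warp size of 32 and the ceiling-division grid sizing are standard CUDA facts for Fermi-class GPUs):

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs, Fermi included

def grid_dim(n_elements, block_size):
    """Threadblocks needed to cover n_elements: ceiling division."""
    return (n_elements + block_size - 1) // block_size

def warps_per_block(block_x, block_y=1):
    """Warps occupied by one threadblock; a partial warp still costs a full warp."""
    threads = block_x * block_y
    return (threads + WARP_SIZE - 1) // WARP_SIZE
```

A 16x16 block, for example, occupies exactly 8 warps, while a 10-thread block wastes most of its single warp; such shape effects are one reason the choice matters for performance.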
Jun 13

A Consumer Application for GPGPUs: Desktop Search

To date, the GPGPU approach has been mainly utilized for academic and scientific computing, for example, for genetic algorithms, image analysis, cryptography, or password cracking. Though video cards supporting GPGPU have become pervasive, there do not appear to be any applications utilizing GPGPU for a household user. In this paper, one consumer application for GPGPU […]
Jun 13

Experiences with High-Level Programming Directives for Porting Applications to GPUs

HPC systems now exploit GPUs within their compute nodes to accelerate program performance. As a result, high-end application development has become extremely complex at the node level. In addition to restructuring the node code to exploit the cores and specialized devices, the programmer may need to choose a programming model such as OpenMP or CPU […]
Jun 13

A comparison of CPU and GPU performance for Fourier pseudospectral simulations of the Navier-Stokes, Cubic Nonlinear Schrödinger and Sine-Gordon Equations

We report results comparing the performance of pseudospectral methods on a single CPU and a single GPU. Our CPU implementations use FFTW, and we compare serial and OpenMP implementations. Our implementations for NVIDIA GPUs use cuFFT, and we compare the performance of the PGI CUDA Fortran, NVIDIA CUDA and PGI OpenACC compilers for similar algorithms.
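The core operation behind such pseudospectral codes is FFT-based differentiation: transform, multiply by the wavenumber, transform back. A minimal NumPy sketch (our own illustration, not the paper's FFTW/cuFFT code):

```python
import numpy as np

def spectral_derivative(u, L=2 * np.pi):
    """Differentiate a periodic sample u on [0, L) via the FFT."""
    n = len(u)
    # Angular wavenumbers matching NumPy's FFT ordering.
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
    return np.real(np.fft.ifft(1j * k * np.fft.fft(u)))
```

For smooth periodic data this is spectrally accurate, and since the transforms dominate the cost, CPU-vs-GPU comparisons of such solvers largely reduce to FFTW-vs-cuFFT throughput.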
Jun 13

Fluid Dynamics Simulations on Multi-GPU Systems

The thesis describes the original design, implementation and testing of the multi-GPU version of two fluid flow simulation models, focusing on the cellular automaton MAGFLOW lava flow simulator and the GPU-SPH model for Navier-Stokes. In both cases, a spatial subdivision of the domain is performed, with a minimal overlap to ensure the correct evaluation of […]
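The spatial subdivision with minimal overlap mentioned above can be sketched in one dimension as follows (a toy illustration under our own assumptions: even partitioning, one-cell halo; the thesis operates on 3-D particle and cell data):

```python
import numpy as np

def split_with_halo(grid, n_parts, halo=1):
    """Split a 1-D grid among n_parts devices with `halo` overlapping
    cells on each interior boundary, so stencils near the edges of a
    subdomain can still read their neighbors' data."""
    n = len(grid)
    size = n // n_parts  # assumes n divisible by n_parts
    parts = []
    for i in range(n_parts):
        lo = max(0, i * size - halo)
        hi = min(n, (i + 1) * size + halo)
        parts.append(grid[lo:hi])
    return parts
```

Each device then updates only its interior cells and exchanges the halo region with its neighbors after every step, which is the overlap-for-correctness trade the abstract alludes to.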
Jun 11

Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation

Language models play an important role in large vocabulary speech recognition and statistical machine translation systems. For several decades, the dominant approach has been back-off language models. Some years ago, there was a clear tendency to build huge language models trained on hundreds of billions of words. Lately, this tendency has changed and recent works concentrate […]
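To illustrate the back-off idea the abstract refers to, here is a toy bigram-to-unigram scorer in the spirit of "stupid backoff" (our own simplification with an illustrative back-off weight; production back-off models such as Katz use discounted probabilities):

```python
from collections import Counter

def backoff_score(w_prev, w, bigrams, unigrams, total, alpha=0.4):
    """Use the bigram relative frequency if the bigram was observed,
    otherwise back off to the scaled unigram relative frequency."""
    if bigrams[(w_prev, w)] > 0 and unigrams[w_prev] > 0:
        return bigrams[(w_prev, w)] / unigrams[w_prev]
    return alpha * unigrams[w] / total

# Toy counts from a six-word corpus.
corpus = "the cat sat on the mat".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total = len(corpus)
```

Seen histories use the higher-order estimate ("the cat" scores 1/2 here), while unseen ones fall back to a penalized lower-order one; scaling this lookup to huge models is what motivates the GPU work.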
Jun 11

GPUSync: Architecture-Aware Management of GPUs for Predictable Multi-GPU Real-Time Systems

The integration of graphics processing units (GPUs) into real-time systems has recently become an active area of research. However, prior research on this topic has failed to produce real-time GPU allocation methods that fully exploit the available parallelism in GPU-enabled systems. In this paper, a GPU management framework called GPUSync is described that was designed […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
