9196

Posts

Mar, 31

Specification and Verification of GPGPU Programs using Permission-Based Separation Logic

Graphics Processing Units (GPUs) are increasingly used for general-purpose applications because of their low price, energy efficiency and enormous computing power. Considering the importance of GPU applications, it is vital that the behaviour of GPU programs can be specified and proven correct formally. This paper presents our ideas how to verify GPU programs written in […]
Mar, 31

A journey from single-GPU to optimized multi-GPU SPH with CUDA

We present an optimized multi-GPU version of GPUSPH, a CUDA implementation of fluid-dynamics models based on the Smoothed Particle Hydrodynamics (SPH) numerical method. SPH is a well-known Lagrangian model for the simulation of free-surface fluid flows; it exposes a high degree of parallelism and has already been successfully ported to GPU. We extend the GPU-based […]
Mar, 31

A Massively Parallel Associative Memory Based on Sparse Neural Networks

Associative memories store content in such a way that the content can be later retrieved by presenting the memory with a small portion of the content, rather than presenting the memory with an address as in more traditional memories. Associative memories are used as building blocks for algorithms within database engines, anomaly detection systems, compression […]
Mar, 31

CMCpy: Genetic Code-Message Coevolution Models in Python

Code-message coevolution (CMC) models represent coevolution of a genetic code and a population of protein-coding genes ("messages"). Formally, CMC models are sets of quasispecies coupled together for fitness through a shared genetic code. Although CMC models display plausible explanations for the origin of multiple genetic code traits by natural selection, useful modern implementations of CMC […]
Mar, 29

High Performance Computing using GPGPU’s

Computer based simulation software having a basis in numerical methods play a major role in research in the area of natural and physical sciences. These tools allow scientists to attempt problems that are too large to solve using analytical methods. But even these tools can fail to give solutions due to computational or storage limits. […]
Mar, 29

Warp Size Impact in GPUs: Large or Small?

There are a number of design decisions that impact a GPU’s performance. Among such decisions deciding the right warp size can deeply influence the rest of the design. Small warps reduce the performance penalty associated with branch divergence at the expense of a reduction in memory coalescing. Large warps enhance memory coalescing significantly but also […]
Mar, 29

Graphics Processing Unit Acceleration of the Explicit Solution of the Time Domain Volume Integral Equation Using OpenACC

A graphics processing unit (GPU) accelerated implementation of the explicit solution of the time domain volume integral equation (TD-VIE) using the OpenACC application program interface (API) is presented. The use of the OpenACC API, which is based on a collection of compiler directives implementation, allows for the ease of porting as well as the efficient […]
Mar, 29

Parallel Simulation of Population Balance Model-Based Particulate Processes Using Multicore CPUs and GPUs

Computer-aided modeling and simulation are a crucial step in developing, integrating, and optimizing unit operations and subsequently the entire processes in the chemical/pharmaceutical industry. This study details two methods of reducing the computational time to solve complex process models, namely, the population balance model which given the source terms can be very computationally intensive. Population […]
Mar, 29

Accelerating Graph Analysis with Heterogeneous Systems

Data analysis is a rising field of interest for computer science research due to the growing amount of information that is digitally available. This increase in data has as direct consequence that any analysis is significantly complex. By using structured representations for the data sets, like graphs, the analysis becomes feasible, but is still time-consuming. […]
Mar, 26

Adaptive OpenCL (ACL) Execution in GPU Architectures

Open Compute Language (OpenCL) has been proposed as a platform-independent, parallel execution model to target heterogeneous systems, including multiple central processing units, graphics processing units (GPUs), and digital signal processors (DSPs). OpenCL parallelism scales with the available resources and hardware generational improvements due to the data-parallel nature of its kernels. Such parallel expressions must adhere […]
Mar, 26

Parallel Sorting on the Heterogeneous AMD Fusion Accelerated Processing Unit

We explore efficient parallel radix sort for the AMD Fusion Accelerated Processing Unit (APU). Two challenges arise: efficiently partitioning data between the CPU and GPU and the allocation of data in memory regions. Our coarse-grained implementation utilizes both the GPU and CPU by sharing data at the begining and end of the sort. Our fine-grained […]
Mar, 26

Accelerated Dictionary Learning with GPU/Multicore CPU and Its Application to Music Classification

K-means clustering and GMM training, as dictionary learning procedures, lie at the heart of many signal processing applications. Increasing data scale requires more efficient ways to perform this process. In this paper a new GPU and multi-core CPU accelerated k-means clustering and GMM training is proposed. We show that both methods can be concisely reformulated […]

* * *

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors

Contact us: