
Oct, 2

CADDIES: A New Framework for Rapid Development of Parallel Cellular Automata Algorithms for Flood Simulation

A recent trend in the development of flood simulation algorithms shows the move toward fast simplified models instead of slow full hydrodynamic models. CADDIES is a research project that aims to develop a real/near-real time pluvial urban flood simulation model using the computational speed of cellular automata (CA) algorithms. This paper presents a component of […]
Oct, 2

Material Removal Simulation and Cutting Force Prediction of Multi-Axis Machining Processes on General-Purpose Graphics Processing Units

The efficient planning of automated machining processes is unthinkable without the use of offline CAM systems. Though machining programs can be written and input manually, right at the machine controller, if the workpiece geometry is complex, or if the machined features are numerous, the help of CAM software is essential for generating the program both […]
Oct, 2

Exploiting Limited Access Distance of ODE Systems for Parallelism and Locality in Explicit Methods

The solution of initial value problems of large systems of ordinary differential equations (ODEs) is computationally intensive and demands efficient parallel solution techniques that take into account the complex architectures of modern parallel computer systems. This article discusses implementation techniques suitable for ODE systems with a special coupling structure, called limited access distance, which […]
Oct, 2

Parallelizing LINQ Program for GPGPU

Recent technologies have brought parallel infrastructure to general users. Nowadays parallel infrastructure is available in PCs and personal laptops. Single-core machines have become history. Even multi-core technologies are being replaced by GPGPUs when it comes to high performance computing, because GPGPUs provide many cores at low cost. Sequential programs of the past are […]
Oct, 2

Multi2Sim: a simulation framework for CPU-GPU computing

Accurate simulation is essential for the proper design and evaluation of any computing platform. With the current move toward the CPU-GPU heterogeneous computing era, researchers need a simulation framework that can model both kinds of computing devices and their interaction. In this paper, we present Multi2Sim, an open-source, modular, and fully configurable toolset that enables […]
Oct, 1

Parallel Application Library for Object Recognition

Computer vision research enables machines to understand the world. Humans usually interpret and analyze the world through what they see – the objects they capture with their eyes. Similarly, machines can better understand the world by recognizing objects in images. Object recognition is therefore a major branch of computer vision. To achieve the highest accuracy, […]
Oct, 1

Accelerated Pressure Projection using OpenCL on GPUs

A GPU version of the pressure projection solver is implemented using OpenCL. It is then compared with a CPU version accelerated with OpenMP. The GPU version shows a sensible reduction in time despite using a simple algorithm in the kernel. The final code is plugged into a commercial fluid simulator. Different kinds […]
Oct, 1

GPGPU Accelerated Texture-Based Radiosity Calculation

Radiosity is a popular global illumination algorithm capable of achieving photorealistic rendering results. However, its use in interactive environments is limited by its computational complexity. This paper presents a GPGPU-based implementation of the gathering radiosity approach using texture-based discretisation and the OpenCL framework. Hemicubes are rendered to a texture array and are processed by OpenCL […]
Oct, 1

Compute Distance Matrices with GPU

Given a data matrix where the rows are objects and the columns are variables, researchers often want to compute all the pairwise distances among the objects. Due to the design of the Nvidia GPU architecture, CUDA code can easily work with data matrices whose numbers of rows and columns are multiples of sixteen. The present […]
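
As a rough illustration of the multiple-of-sixteen constraint mentioned above, the sketch below zero-pads a data matrix before computing all pairwise Euclidean distances, then discards the padding. It is a plain CPU sketch in Python, not the paper's CUDA code; the function names `pad_to_multiple` and `pairwise_euclidean` and the constant `TILE` are hypothetical, and the choice of Euclidean distance is an assumption. Zero-padding the columns leaves Euclidean distances unchanged, while padded rows only add extra "objects" that are dropped from the result.

```python
import math

TILE = 16  # assumed tile width, taken from the "multiples of sixteen" remark


def pad_to_multiple(matrix, tile=TILE):
    """Zero-pad rows and columns so both counts are multiples of `tile`."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    padded_rows = -(-n_rows // tile) * tile  # ceiling division
    padded_cols = -(-n_cols // tile) * tile
    # Extend every existing row with zeros, then append all-zero rows.
    padded = [list(row) + [0.0] * (padded_cols - n_cols) for row in matrix]
    padded += [[0.0] * padded_cols for _ in range(padded_rows - n_rows)]
    return padded


def pairwise_euclidean(matrix, tile=TILE):
    """Return the n x n Euclidean distance matrix for the n original rows."""
    n = len(matrix)
    padded = pad_to_multiple(matrix, tile)
    # Distances are computed on the padded matrix, as a tiled kernel would.
    full = [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(row_i, row_j)))
         for row_j in padded]
        for row_i in padded
    ]
    # Keep only the block that corresponds to the original objects.
    return [row[:n] for row in full[:n]]
```

Calling `pairwise_euclidean(data)` on a matrix with three rows returns a 3 x 3 matrix, even though the intermediate computation runs on a 16 x 16 padded matrix.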
Oct, 1

Synthesizing Structured Traversals from Attribute Grammars

We examine how to automatically decompose a program into structured parallel traversals over trees. In our system, programs are declaratively specified as attribute grammars and parallel traversals are defined by a compiler designed to optimize them for both GPUs and multicore CPUs. Our synthesizer automatically finds a correct schedule of the attribute grammar as structured […]
Sep, 30

CUDA-Zero: a framework for porting shared memory GPU applications to multi-GPUs

With the prevalence of general-purpose computation on GPUs, shared memory programming models were proposed to ease the pain of GPU programming. However, with the demands of more intensive workloads, it is desirable to port GPU programs to a more scalable distributed memory environment, such as multi-GPUs. To achieve this, programs need to be re-written with […]
Sep, 30

Nonperturbative Quantum Field Theory in Astrophysics

The extreme electromagnetic or gravitational fields associated with some astrophysical objects can give rise to macroscopic effects arising from the physics of the quantum vacuum. Therefore, these objects are incredible laboratories for exploring the physics of quantum field theories. In this dissertation, we explore this idea in three astrophysical scenarios.

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
