Posts
Sep, 13
Exploring Multiple Dimensions of Parallelism in Junction Tree Message Passing
Belief propagation over junction trees is known to be computationally challenging in the general case. One way of addressing this computational challenge is to use node-level parallel computing, and parallelize the computation associated with each separator potential table cell. However, this approach is not efficient for junction trees that mainly contain small separators. In this […]
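As a rough illustration of the per-cell parallelism mentioned in this abstract, the sketch below marginalizes a clique potential table onto a separator. Each separator cell is an independent sum, so cells could be computed in parallel; a small separator has few cells and therefore little work to distribute. The function name `marginalize_to_separator`, the dictionary table layout, and the example potentials are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

def marginalize_to_separator(clique_vars, clique_table, sep_vars, cardinality):
    """Sum a clique potential down to the variables in sep_vars.

    clique_table maps a tuple of states (ordered as clique_vars) to a value.
    """
    sep_idx = [clique_vars.index(v) for v in sep_vars]
    # One output cell per joint state of the separator variables.
    sep_table = {
        cell: 0.0
        for cell in product(*(range(cardinality[v]) for v in sep_vars))
    }
    for states, value in clique_table.items():
        cell = tuple(states[i] for i in sep_idx)
        sep_table[cell] += value  # each cell accumulates independently
    return sep_table

# Hypothetical clique over binary variables A, B, C with separator {B}.
clique_vars = ["A", "B", "C"]
cardinality = {"A": 2, "B": 2, "C": 2}
clique_table = {
    s: float(1 + s[0] + 2 * s[1] + 4 * s[2])
    for s in product(range(2), repeat=3)
}
message = marginalize_to_separator(clique_vars, clique_table, ["B"], cardinality)
```

Here the separator has only two cells, which hints at why cell-level parallelism alone leaves most processing units idle for small separators.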
Sep, 13
Recent progress and challenges in exploiting graphics processors in computational fluid dynamics
The progress made in accelerating simulations of fluid flow using GPUs, and the challenges that remain, are surveyed. The review first provides an introduction to GPU computing and programming, and discusses various considerations for improved performance. Case studies comparing the performance of CPU- and GPU-based solvers for the Laplace and incompressible Navier-Stokes equations are […]
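To make the Laplace case study more concrete, here is a minimal sketch of a Jacobi iteration for the 2D Laplace equation on a small grid with fixed boundary values. It is plain sequential Python, not the solver from the review; the function name `jacobi_laplace`, the grid size, and the boundary values are assumptions chosen for illustration. Each interior point depends only on the previous iterate, which is the property that makes this kind of stencil update a natural fit for GPUs.

```python
def jacobi_laplace(grid, iterations):
    """Run Jacobi sweeps for the 2D Laplace equation with fixed boundaries."""
    rows, cols = len(grid), len(grid[0])
    for _ in range(iterations):
        new = [row[:] for row in grid]
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                # Average of the four neighbours from the previous iterate;
                # every interior point can be updated independently.
                new[i][j] = 0.25 * (
                    grid[i - 1][j] + grid[i + 1][j]
                    + grid[i][j - 1] + grid[i][j + 1]
                )
        grid = new
    return grid

# Hypothetical 5x5 problem: top edge held at 1.0, other edges at 0.0.
n = 5
grid = [[0.0] * n for _ in range(n)]
grid[0] = [1.0] * n
solution = jacobi_laplace(grid, 200)
center = solution[2][2]
```

By symmetry, the converged value at the centre of this grid is one quarter of the top-edge value.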
Sep, 13
Accelerating moderately stiff chemical kinetics in reactive-flow simulations using GPUs
The chemical kinetics ODEs arising from operator-split reactive-flow simulations were solved on GPUs using explicit integration algorithms. Nonstiff chemical kinetics of a hydrogen oxidation mechanism (9 species and 38 irreversible reactions) were computed using the explicit fifth-order Runge-Kutta-Cash-Karp method, and the GPU-accelerated version performed faster than single- and six-core CPU versions by factors of 126 […]
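For readers unfamiliar with the integrator named in this abstract, the following is a minimal sketch of a fixed-step Runge-Kutta-Cash-Karp scheme using the standard Cash-Karp coefficients. It runs sequentially on a scalar test problem (y' = -y), which is an assumption for illustration; the paper solves chemical kinetics systems on GPUs, and its step-size control and data layout are not reproduced here. The helper names `rkck_step` and `integrate` are hypothetical.

```python
# Standard Cash-Karp Butcher tableau.
C = [0.0, 1 / 5, 3 / 10, 3 / 5, 1.0, 7 / 8]
A = [
    [],
    [1 / 5],
    [3 / 40, 9 / 40],
    [3 / 10, -9 / 10, 6 / 5],
    [-11 / 54, 5 / 2, -70 / 27, 35 / 27],
    [1631 / 55296, 175 / 512, 575 / 13824, 44275 / 110592, 253 / 4096],
]
B5 = [37 / 378, 0.0, 250 / 621, 125 / 594, 0.0, 512 / 1771]  # fifth order
B4 = [2825 / 27648, 0.0, 18575 / 48384, 13525 / 55296, 277 / 14336, 1 / 4]  # embedded fourth order

def rkck_step(f, t, y, h):
    """Take one Cash-Karp step; return the fifth-order result and an error estimate."""
    k = []
    for i in range(6):
        y_stage = y + h * sum(a * kj for a, kj in zip(A[i], k))
        k.append(f(t + C[i] * h, y_stage))
    y5 = y + h * sum(b * ki for b, ki in zip(B5, k))
    y4 = y + h * sum(b * ki for b, ki in zip(B4, k))
    return y5, abs(y5 - y4)

def integrate(f, t0, y0, t_end, steps):
    """Integrate with a fixed step size (no adaptive step control in this sketch)."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y, _err = rkck_step(f, t, y, h)
        t += h
    return y

# Hypothetical test problem: y' = -y, y(0) = 1, integrated to t = 1.
y_end = integrate(lambda t, y: -y, 0.0, 1.0, 1.0, 10)
```

In an operator-split reactive-flow code, one such integration would typically be carried out per grid cell, and those per-cell integrations are independent of each other within a time step.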
Sep, 13
A massively parallel program to solve the phase field formulation for crack propagation
Phase field models for fracture employ a continuous field variable to model cracks. Therefore, in contrast to discrete descriptions of fracture, numerical tracking of discontinuities in the displacement field is not required. This greatly reduces implementation complexity. In this paper, we discuss the use of a single graphical processing unit (GPU) to accelerate the solution […]
Sep, 13
Simulation and modeling of physical broadcasts
The environment around us exhibits many phenomena whose behavior depends on biological, chemical, physical, and other parameters. To represent this environment in a simple, abstract form, we use environmental modeling. Environmental modeling addresses many environmental problems such as air pollution, diffusion of disease, animal behavior and so […]
Sep, 13
Neptune: An astrophysical smooth particle hydrodynamics code for massively parallel computer architectures
Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU-accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named neptune after the Roman god of water. It is written in OpenMP […]
Sep, 13
Fast computation of computer-generated hologram using Xeon Phi coprocessor
We report fast computation of computer-generated holograms (CGHs) using Xeon Phi coprocessors, recently released by Intel, which integrate many x86-based processor cores on one chip. CGHs can generate arbitrary light wavefronts and are therefore a promising technology for many applications, for example three-dimensional displays, diffractive optical elements, and the generation of arbitrary beams. CGHs incur enormous computational […]
Sep, 11
Histogram Computations on GPUs Kernel using Global and Shared Memory Atomics
In this paper we implement histogram computations on a Graphics Processing Unit (GPU). Our histogram computation is implemented using the Compute Unified Device Architecture (CUDA), which is a minimal extension to C/C++. In this implementation, histograms are computed in the GPU’s global memory as well as in shared memory. We also perform histogram computations on the CPU and […]
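The difference between the global-memory and shared-memory variants can be sketched without a GPU. The code below is a sequential Python emulation, not CUDA and not the paper's kernels: `histogram_global` updates one shared result for every element, while `histogram_privatized` first accumulates a private histogram per block (standing in for shared memory) and then merges it into the result. The function names, block size, and input data are assumptions for illustration; on a real GPU each increment would be an atomic operation such as CUDA's `atomicAdd`.

```python
def histogram_global(data, num_bins):
    """Every element updates the single global histogram directly."""
    hist = [0] * num_bins
    for value in data:
        hist[value] += 1  # on a GPU: an atomic add on global memory
    return hist

def histogram_privatized(data, num_bins, block_size):
    """Each block fills a private histogram, then merges it into the global one."""
    hist = [0] * num_bins
    for start in range(0, len(data), block_size):
        local = [0] * num_bins  # stands in for a per-block shared-memory histogram
        for value in data[start:start + block_size]:
            local[value] += 1   # on a GPU: an atomic add on shared memory
        for b in range(num_bins):
            hist[b] += local[b]  # on a GPU: one global atomic add per bin per block
    return hist

# Hypothetical input: 1000 values spread evenly over 8 bins.
data = [(7 * i + 3) % 8 for i in range(1000)]
global_hist = histogram_global(data, 8)
shared_hist = histogram_privatized(data, 8, block_size=128)
```

Both variants produce the same counts; the privatized version reduces the number of updates that hit the global histogram from one per element to one per bin per block.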
Sep, 11
Evaluating the Viability of Application-Driven Cooperative CPU/GPU Fault Detection
Trends in high performance computing are bringing increased heterogeneity among the computational resources within a single machine. The heterogeneous CPU/GPU platforms, however, exacerbate resilience problems faced by current large-scale systems. Designing efficient resilience strategies is critical for the wider adoption of heterogeneous platforms for future exascale systems. The conventional resilience strategy for GPU […]
Sep, 11
Coherent transport by adiabatic passage on atom chips
Adiabatic techniques offer some of the most promising tools to achieve high-fidelity control of the centre-of-mass degree of freedom of single atoms. As their main requirement is to follow an eigenstate of the system, constraints on timing and field strength stability are usually low, especially for trapped systems. In this paper we present a detailed […]
Sep, 11
D5.5.4 – Characterization of Redundancy and Definition of Work Reuse
This task involves the following work: – Establishing the relation of Quality of Service (QoS) and energy to accuracy. – Design and development of techniques to dynamically decrease accuracy (e.g., ignore low order bits in computations). Deliberately ignoring a few low order bits in calculations where the application allows it (in terms of impact to […]
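The idea of deliberately ignoring low-order bits can be illustrated with a small sketch. The code below masks the least significant bits of non-negative integers before summing them and reports the resulting relative error. The helper names `drop_low_bits` and `approximate_sum`, the choice of four dropped bits, and the sample values are assumptions for illustration; the deliverable's actual techniques are not described in this excerpt.

```python
def drop_low_bits(value, bits):
    """Zero the `bits` least significant bits of a non-negative integer."""
    return value & ~((1 << bits) - 1)

def approximate_sum(values, bits):
    """Sum the values after discarding their low-order bits."""
    return sum(drop_low_bits(v, bits) for v in values)

# Hypothetical data: compare the exact sum with a reduced-accuracy sum.
values = [1000, 1023, 517, 64]
exact = sum(values)
approx = approximate_sum(values, 4)
relative_error = (exact - approx) / exact
```

Dropping four bits changes each value by less than 16, so the error stays small relative to values of this magnitude; whether that is acceptable depends on the application.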
Sep, 11
Hardware-Oblivious Parallelism for In-Memory Column-Stores
The multi-core architectures of today’s computer systems make parallelism a necessity for performance-critical applications. Writing such applications in a generic, hardware-oblivious manner is a challenging problem: current database systems thus rely on labor-intensive and error-prone manual tuning to exploit the full potential of modern parallel hardware architectures like multi-core CPUs and graphics cards. We […]