Posts
Sep, 13
Accelerating moderately stiff chemical kinetics in reactive-flow simulations using GPUs
The chemical kinetics ODEs arising from operator-split reactive-flow simulations were solved on GPUs using explicit integration algorithms. Nonstiff chemical kinetics of a hydrogen oxidation mechanism (9 species and 38 irreversible reactions) were computed using the explicit fifth-order Runge-Kutta-Cash-Karp method, and the GPU-accelerated version performed faster than single- and six-core CPU versions by factors of 126 […]
Sep, 13
A massively parallel program to solve the phase field formulation for crack propagation
Phase field models for fracture employ a continuous field variable to model cracks. Therefore, in contrast to discrete descriptions of fracture, numerical tracking of discontinuities in the displacement field is not required. This greatly reduces implementation complexity. In this paper, we discuss the use of a single graphics processing unit (GPU) to accelerate the solution […]
Sep, 13
Simulation and modeling of physical broadcasts
The environment around us exhibits many phenomena whose behavior depends on biological, chemical, physical, and other parameters. To represent a simple, abstract view of this environment, we use a concept called environmental modeling. Environmental modeling deals with many environmental problems such as air pollution, diffusion of disease, animal behavior and so […]
Sep, 13
Neptune: An astrophysical smooth particle hydrodynamics code for massively parallel computer architectures
Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named neptune after the Roman god of water. It is written in OpenMP […]
Sep, 13
Fast computation of computer-generated hologram using Xeon Phi coprocessor
We report fast computation of computer-generated holograms (CGHs) using Xeon Phi coprocessors, recently released by Intel, which integrate many x86-based processor cores on one chip. CGHs can generate arbitrary light wavefronts and are therefore a promising technology for many applications: for example, three-dimensional displays, diffractive optical elements, and the generation of arbitrary beams. CGHs incur enormous computational […]
Sep, 11
Histogram Computations on GPUs Kernel using Global and Shared Memory Atomics
In this paper we implement histogram computations on a Graphics Processing Unit (GPU). Our histogram computation is implemented using Compute Unified Device Architecture (CUDA), which is a minimal extension to C/C++. In this work, histograms are computed in the GPU’s global memory as well as in shared memory. We also perform histogram computations on the CPU and […]
Sep, 11
Evaluating the Viability of Application-Driven Cooperative CPU/GPU Fault Detection
Trends in high performance computing are bringing increased heterogeneity among the computational resources within a single machine. The heterogeneous CPU/GPU platforms, however, exacerbate resilience problems faced by current large-scale systems. How to design efficient resilience strategies is critical for the wider adoption of heterogeneous platforms for future exascale systems. The conventional resilience strategy for GPU […]
Sep, 11
Coherent transport by adiabatic passage on atom chips
Adiabatic techniques offer some of the most promising tools to achieve high-fidelity control of the centre-of-mass degree of freedom of single atoms. As their main requirement is to follow an eigenstate of the system, constraints on timing and field strength stability are usually low, especially for trapped systems. In this paper we present a detailed […]
Sep, 11
D5.5.4 – Characterization of Redundancy and Definition of Work Reuse
This task involves the following work: – Establishing the relation of Quality of Service (QoS) and energy to accuracy. – Design and development of techniques to dynamically decrease accuracy (e.g., ignore low order bits in computations). Deliberately ignoring a few low order bits in calculations where the application allows it (in terms of impact to […]
Sep, 11
Hardware-Oblivious Parallelism for In-Memory Column-Stores
The multi-core architectures of today’s computer systems make parallelism a necessity for performance critical applications. Writing such applications in a generic, hardware-oblivious manner is a challenging problem: Current database systems thus rely on labor-intensive and error-prone manual tuning to exploit the full potential of modern parallel hardware architectures like multi-core CPUs and graphics cards. We […]
Sep, 11
Iterative and Predictive Ray-Traced Collision Detection for Multi-GPU Architectures
Collision detection is a complex task that can be described simply: given a set of objects, we want to know which ones collide. In the literature, we can find numerous algorithms that depend on object properties, but no overall solution that works for every object. The internship focuses on a recent algorithm […]
Sep, 11
GPU Implementations of Object Detection using HOG Features and Deformable Models
Vision-based object detection using camera sensors is an essential piece of perception for autonomous vehicles. Various combinations of features and models can be applied to increase the quality and the speed of object detection. A well-known approach uses histograms of oriented gradients (HOG) with deformable models to detect a car in an image [15]. A […]