Posts
Feb, 14
Developing Performance-Portable Molecular Dynamics Kernels in OpenCL
This paper investigates the development of a molecular dynamics code that is highly portable between architectures. Using OpenCL, we develop an implementation of Sandia’s miniMD benchmark that achieves good levels of performance across a wide range of hardware: CPUs, discrete GPUs and integrated GPUs. We demonstrate that the performance bottlenecks of miniMD’s short-range force calculation […]
Feb, 14
Exploring SIMD for Molecular Dynamics, Using Intel Xeon Processors and Intel Xeon Phi Coprocessors
We analyse gather-scatter performance bottlenecks in molecular dynamics codes and the challenges that they pose for obtaining benefits from SIMD execution. This analysis informs a number of novel code-level and algorithmic improvements to Sandia’s miniMD benchmark, which we demonstrate using three SIMD widths (128-, 256- and 512-bit). The applicability of these optimisations to wider SIMD […]
Feb, 14
Enhancing Performance of Meshfree Methods by Hybrid Computing
A hybrid computing technique is used in this study to significantly enhance the performance of meshfree methods. These methods are typically slower than finite element methods (FEM), mostly because their stiffness matrices are much denser than those formed by FEM. As a result, both forming stiffness matrices and solving equations are much slower. In this paper, we […]
Feb, 14
The Dual-Path Execution Model for Efficient GPU Control Flow
Current graphics processing units (GPUs) utilize the single instruction multiple thread (SIMT) execution model. With SIMT, a group of logical threads executes such that all threads in the group execute a single common instruction on a particular cycle. To enable control flow to diverge within the group of threads, GPUs partially serialize execution and follow […]
Feb, 14
Nonlinear dynamic finite element analysis with GPU
The Newmark family of algorithms has been utilized by many engineering applications for the nonlinear dynamic analysis of various structural models. The dynamic and nonlinear nature of such problems and the numerical stability requirements of the algorithms increase the need for computation power in order to achieve practical solution times. Thus, this study intends to decrease […]
Feb, 14
Efficient Exploitation of Heterogeneous Platforms for Images Features Extraction
Image processing algorithms are a necessary tool for various domains related to computer vision, such as video surveillance, medical imaging, pattern recognition, etc. However, these algorithms are hampered by their high consumption of both computing power and memory, which increases significantly when processing large sets of high-resolution images. In this work, we propose a […]
Feb, 14
Partial Volume Effect Correction using Anisotropic Backward Diffusion
This paper proposes an algorithm for correcting Partial Volume Effect in Positron Emission Tomography (PET) images, using registered Computed Tomography (CT) data to enhance the blurred PET image. The algorithm is based on a forward-and-backward anisotropic heat equation solver, deblurring the PET image along CT gradients. A forward diffusion force is also utilized to stabilize […]
Feb, 14
Finite-size scaling method for the Berezinskii-Kosterlitz-Thouless transition
We present an improved finite-size scaling method for reliably extracting the critical temperature T_BKT of a Berezinskii-Kosterlitz-Thouless (BKT) transition. Using the known Weber-Minhagen multiplicative logarithmic correction to the spin stiffness rho_s at T_BKT and the Kosterlitz-Nelson relation between the transition temperature and the stiffness, rho_s(T_BKT)=2T_BKT/pi, we define a size-dependent transition temperature T_BKT(L_1,L_2) based […]
Feb, 14
Evaluation of Standardized Password-based Key Derivation against Parallel Processing Platforms
Passwords are still the preferred method of user authentication for a large number of applications. In order to derive cryptographic keys from (human-entered) passwords, key-derivation functions are used. One of the most well-known key-derivation functions is the standardized PBKDF2 (RFC2898), which is used in TrueCrypt, CCMP of WPA2, and many more. In this work, we […]
Feb, 14
Real-Time 3D Face Identification from a Depth Camera
We present a real-time 3D face identification system using a consumer-level depth camera (PrimeSensor). Our system takes a noisy sequence as input and produces reliable identification. Instead of registering a probe to all instances in the database, we propose to only register it with several intermediate references, which considerably reduces processing, while preserving the […]
Feb, 13
ADBIS workshop on GPUs In Databases, GID 2013
The high performance of modern Graphics Processing Units may be utilized not only for graphics-related applications but also for general computing. This computing power has been utilized in new variants of many algorithms from almost every computer science domain. Unfortunately, while other application domains strongly benefit from utilizing GPUs, database-related applications seem not […]
Feb, 12
Exploiting Multiple Levels of Parallelism and Online Refinement of Unstructured Meshes in Atmospheric Model Application
Weather forecasting over long periods of time has become increasingly important. Global concern about the consequences of climate change has stimulated research to determine the climate in coming decades. At the same time, the steps needed to better define the modeling and simulation of climate/weather are far from the desired accuracy. Upscaling […]