Posts
Nov, 22
Algorithm level power efficiency optimization for CPU-GPU processing element in data intensive SIMD/SPMD computing
Power efficiency must be investigated at every level of a High Performance Computing (HPC) system because of the increasing computation demands of scientific and engineering applications. Focusing on the critical design constraints at the software level, for programs that run on a parallel system composed of huge numbers of power-hungry components, we optimize HPC program design […]
Nov, 22
Higher-order CFD and Interface Tracking Methods on Highly-Parallel MPI and GPU systems
A computational investigation of the effects on parallel performance of higher-order accurate schemes was carried out on two different computational systems: a traditional CPU-based MPI cluster and a system of four Graphics Processing Units (GPUs) controlled by a single quad-core CPU. The investigation was based on the solution of the level set equations for […]
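The level set equation mentioned above can be illustrated with a minimal 1D sketch. This is an assumed simplification for illustration only: the post concerns higher-order schemes on 3D MPI/GPU systems, while the sketch below advances the simplest advection form of the equation, phi_t + u * phi_x = 0, with a first-order upwind scheme; the function name and parameters are hypothetical.

```python
def advect_level_set(phi, u, dx, dt, steps):
    """Advance phi_t + u * phi_x = 0 with first-order upwind differences.

    phi: list of level set values on a periodic 1D grid.
    The zero crossing of phi (the tracked interface) moves with speed u.
    """
    n = len(phi)
    for _ in range(steps):
        new = phi[:]
        for i in range(n):
            if u > 0:
                dphi = phi[i] - phi[i - 1]          # backward (upwind) difference
            else:
                dphi = phi[(i + 1) % n] - phi[i]    # forward difference
            new[i] = phi[i] - u * dt / dx * dphi
        phi = new
    return phi
```

With a signed-distance profile, the interior values shift by u * dt per step, i.e. the interface is transported at speed u; higher-order schemes such as those in the post replace the one-sided difference with wider, more accurate stencils.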
Nov, 22
Efficient simulation of agent-based models on multi-GPU and multi-core clusters
An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory […]
Nov, 22
GPU-accelerated elastic 3D image registration for intra-surgical applications
Local motion within intra-patient biomedical images can be compensated by using elastic image registration. The application of B-spline based elastic registration during interventional treatment is seriously hampered by its considerable computation time. The graphics processing unit (GPU) can be used to accelerate the calculation of such elastic registrations by using its parallel processing power, and […]
Nov, 22
GPU accelerated tensor contractions in the plaquette renormalization scheme
We use the graphics processing unit (GPU) to accelerate the tensor contractions, the most time-consuming operations in the variational method based on plaquette renormalized states. Using a frustrated Heisenberg J1-J2 model on a square lattice as an example, we implement the algorithm based on the Compute Unified Device Architecture (CUDA). For […]
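A tensor contraction of the kind the post accelerates reduces, after reshaping, to a matrix product over the shared index, which is how GPU libraries typically execute it. A minimal sketch under that assumption (the shapes, indices, and function name are illustrative, not taken from the paper):

```python
def contract(a, b):
    """Contract a[i][k] with b[k][j] over the shared index k.

    This is the matrix-product form that a pairwise tensor contraction
    reduces to after grouping free indices; on a GPU the same triple
    loop is dispatched to a batched GEMM kernel.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):              # shared (contracted) index
            aik = a[i][k]
            for j in range(cols):
                out[i][j] += aik * b[k][j]
    return out
```

For example, `contract([[1, 2], [3, 4]], [[5, 6], [7, 8]])` yields `[[19.0, 22.0], [43.0, 50.0]]`.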
Nov, 22
GPU-accelerated phase-field simulation of dendritic solidification in a binary alloy
The phase-field simulation of dendritic solidification of a binary alloy has been accelerated by using a Graphics Processing Unit (GPU). To perform the phase-field simulation of the alloy solidification on the GPU, a program code was developed with the Compute Unified Device Architecture (CUDA). In this paper, the implementation technique of the phase-field model on the GPU is […]
Nov, 22
GPU-accelerated molecular modeling coming of age
Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over […]
Nov, 22
Accelerating electrostatic surface potential calculation with multi-scale approximation on graphics processing units
Tools that compute and visualize biomolecular electrostatic surface potential have been used extensively for studying biomolecular function. However, determining the surface potential for large biomolecules on a typical desktop computer can take days or longer using currently available tools and methods. Two commonly used techniques to speed up these types of electrostatic computations are approximations based […]
Nov, 22
An Efficient Implementation of GPU Virtualization in High Performance Clusters
Current high performance clusters are equipped with high-bandwidth/low-latency networks, large numbers of processors and nodes, very fast storage systems, etc. However, due to economic and/or power-related constraints, it is generally not feasible to provide an accelerating co-processor, such as a graphics processor (GPU), per node. To overcome this, in this […]
Nov, 21
Megapixel Topology Optimization on a Graphics Processing Unit
We show how the computational power and programmability of modern graphics processing units (GPUs) can be used to efficiently solve large-scale pixel-based material distribution problems using a gradient-based optimality criterion method. To illustrate the principle, a so-called topology optimization problem that results in a constrained nonlinear programming problem with over 4 million decision variables is […]
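The gradient-based optimality criterion method named above updates every pixel density from its sensitivity in a closed form, which is what makes million-variable problems tractable on a GPU. A hedged sketch of one such update step follows; the function name, the move limit of 0.2, and the damping exponent of 0.5 are conventional textbook choices, not values from the paper:

```python
def oc_update(x, dc, dv, lmbda, move=0.2, eta=0.5):
    """One optimality-criteria density update.

    x:     current element densities in [0, 1]
    dc:    compliance sensitivities (negative for useful material)
    dv:    volume-constraint sensitivities
    lmbda: Lagrange multiplier for the volume constraint
    """
    new = []
    for xi, dci, dvi in zip(x, dc, dv):
        be = (-dci / (lmbda * dvi)) ** eta           # damped scaling factor
        cand = xi * be
        cand = max(xi - move, min(xi + move, cand))  # move limit
        cand = max(0.001, min(1.0, cand))            # density bounds
        new.append(cand)
    return new
```

At the optimum the scaling factor is 1 and densities stop changing; in practice the multiplier `lmbda` is found by bisection so the updated densities meet the volume constraint, and each element's update is independent, which maps naturally onto one GPU thread per pixel.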
Nov, 21
A high performance agent based modelling framework on graphics card hardware with CUDA
We present an efficient implementation of a high performance parallel framework for Agent Based Modelling (ABM), exploiting the parallel architecture of the Graphics Processing Unit (GPU). It provides a mapping between formal agent specifications, with C-based scripting, and optimised NVIDIA Compute Unified Device Architecture (CUDA) code. The mapping of agent data structures and agent […]
Nov, 21
Breaking ECC2K-130
Elliptic-curve cryptography is becoming the standard public-key primitive not only for mobile devices but also for high-security applications. Advantages are the higher cryptographic strength per bit in comparison with RSA and the higher speed in implementations. To improve understanding of the exact strength of the elliptic-curve discrete-logarithm problem, Certicom has published a series of challenges. […]