## Posts

Nov, 22

### Higher-order CFD and Interface Tracking Methods on Highly-Parallel MPI and GPU systems

A computational investigation of the effects of higher-order accurate schemes on parallel performance was carried out on two different computational systems: a traditional CPU-based MPI cluster and a system of four graphics processing units (GPUs) controlled by a single quad-core CPU. The investigation was based on the solution of the level set equations for […]
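The level set equation at the heart of this work is, in its advection form, φ_t + u·∇φ = 0. As a minimal CPU sketch (not the paper's higher-order solver), here is a first-order upwind step for a 1D level set function on a periodic grid; the grid, velocity, and profile are all illustrative choices:

```python
import numpy as np

def upwind_advect(phi, u, dx, dt):
    """One first-order upwind step of the advection form of the
    level set equation  phi_t + u * phi_x = 0  on a periodic 1D grid."""
    # One-sided backward and forward differences.
    dminus = (phi - np.roll(phi, 1)) / dx
    dplus = (np.roll(phi, -1) - phi) / dx
    # Upwinding: take the difference from the side the flow comes from.
    return phi - dt * np.where(u > 0, u * dminus, u * dplus)

# Advect a signed-distance-like profile to the right.
x = np.linspace(0.0, 1.0, 200, endpoint=False)
phi = np.abs(x - 0.3) - 0.1          # zero crossings at 0.2 and 0.4
u = np.full_like(x, 1.0)
dx = x[1] - x[0]
dt = 0.5 * dx                        # CFL number 0.5
for _ in range(100):                 # total displacement 100 * dt = 0.25
    phi = upwind_advect(phi, u, dx, dt)
```

Higher-order schemes (e.g. WENO) replace the one-sided differences with wider stencils, which is exactly what changes the compute-to-communication ratio studied in the post.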

Nov, 22

### Efficient simulation of agent-based models on multi-GPU and multi-core clusters

An effective latency-hiding mechanism is presented in the parallelization of agent-based model simulations (ABMS) with millions of agents. The mechanism is designed to accommodate the hierarchical organization as well as heterogeneity of current state-of-the-art parallel computing platforms. We use it to explore the computation vs. communication trade-off continuum available with the deep computational and memory […]
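The core of a latency-hiding mechanism is overlapping boundary communication with interior computation that does not depend on it. A minimal sketch, using a Python thread to stand in for an asynchronous MPI halo exchange (the `exchange_halo` callback and the top/bottom-only halo are illustrative assumptions, not the paper's design):

```python
import threading
import numpy as np

def step_with_overlap(grid, exchange_halo):
    """One simulation step that hides communication latency: the halo
    exchange runs in the background while the interior (which does not
    depend on the halo) is updated. Only top/bottom halos are shown."""
    halo = {}
    def do_exchange():
        halo.update(exchange_halo(grid))   # stands in for MPI send/recv
    t = threading.Thread(target=do_exchange)
    t.start()                              # communication in flight...
    interior = 0.25 * (grid[2:, 1:-1] + grid[:-2, 1:-1]
                       + grid[1:-1, 2:] + grid[1:-1, :-2])  # ...while we compute
    t.join()                               # wait before touching the boundary
    new = grid.copy()
    new[1:-1, 1:-1] = interior
    new[0, :], new[-1, :] = halo["top"], halo["bottom"]
    return new
```

On a GPU cluster the same pattern appears as overlapping `cudaMemcpyAsync`/MPI transfers with interior kernels on separate streams.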

Nov, 22

### GPU-accelerated elastic 3D image registration for intra-surgical applications

Local motion within intra-patient biomedical images can be compensated for by elastic image registration. The application of B-spline-based elastic registration during interventional treatment is seriously hampered by its considerable computation time. The graphics processing unit (GPU) can be used to accelerate the calculation of such elastic registrations using its parallel processing power, and […]
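In B-spline free-form deformation, each image point is displaced by a weighted sum of the four (in 1D) surrounding control points, using the standard cubic B-spline basis. A 1D sketch of that evaluation (the uniform control grid and coefficient layout are illustrative, not the paper's implementation):

```python
import numpy as np

def bspline_basis(t):
    """Cubic B-spline basis weights for local coordinate t in [0, 1).
    The four weights always sum to 1 (partition of unity)."""
    return np.array([(1 - t) ** 3 / 6,
                     (3 * t ** 3 - 6 * t ** 2 + 4) / 6,
                     (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6,
                     t ** 3 / 6])

def displacement(x, control, spacing):
    """Displacement at position x from a 1D grid of control-point
    coefficients with uniform spacing (free-form deformation)."""
    i = int(np.floor(x / spacing))   # index of the spline segment
    t = x / spacing - i              # local coordinate within it
    w = bspline_basis(t)
    # Each point is influenced by the 4 surrounding control points.
    return sum(w[k] * control[i + k] for k in range(4))
```

Because every image point evaluates this independently, the computation maps naturally onto one GPU thread per voxel, which is what makes the registration a good acceleration target.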

Nov, 22

### GPU accelerated tensor contractions in the plaquette renormalization scheme

We use the graphics processing unit (GPU) to accelerate tensor contractions, which are the most time-consuming operations in the variational method based on plaquette renormalized states. Using a frustrated Heisenberg J1-J2 model on a square lattice as an example, we implement the algorithm based on the compute unified device architecture (CUDA). For […]
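The standard way to make tensor contractions GPU-fast is to fuse the contracted and free indices so the contraction becomes a single matrix-matrix product, which BLAS/cuBLAS-style kernels handle efficiently. A small NumPy sketch of that mapping (the shapes and index pattern are illustrative, not the plaquette tensors from the post):

```python
import numpy as np

# Contract a rank-3 tensor with a rank-4 tensor over two shared indices:
#   C[a, d, e] = sum_{b, c} A[a, b, c] * B[b, c, d, e]
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))
B = rng.standard_normal((5, 6, 7, 8))

# Reference: direct einsum.
C_ref = np.einsum('abc,bcde->ade', A, B)

# GPU-friendly form: flatten the contracted pair (b, c) and the free
# pair (d, e), turning the contraction into one GEMM call.
C = (A.reshape(4, 5 * 6) @ B.reshape(5 * 6, 7 * 8)).reshape(4, 7, 8)
```

The reshapes are free in row-major layout, so essentially all of the time goes into the matrix product, the operation GPUs accelerate best.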

Nov, 22

### GPU-accelerated phase-field simulation of dendritic solidification in a binary alloy

The phase-field simulation of dendritic solidification of a binary alloy has been accelerated using a graphics processing unit (GPU). To perform the phase-field simulation of the alloy solidification on the GPU, a program code was developed with the Compute Unified Device Architecture (CUDA). In this paper, the implementation technique of the phase-field model on the GPU is […]
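Structurally, a phase-field time step is a per-cell stencil update: a finite-difference Laplacian plus a local double-well reaction term. A minimal Allen-Cahn-type sketch of what such a GPU kernel computes per cell (this is the generic equation structure, not the paper's alloy solidification model; `eps` and `M` are illustrative parameters):

```python
import numpy as np

def laplacian(phi, dx):
    """5-point finite-difference Laplacian on a periodic grid."""
    return (np.roll(phi, 1, 0) + np.roll(phi, -1, 0)
            + np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4 * phi) / dx ** 2

def phase_field_step(phi, dx, dt, eps=0.01, M=1.0):
    """One explicit Euler step of an Allen-Cahn type phase-field
    equation with a double-well potential f'(phi) = phi**3 - phi."""
    dphi = M * (eps ** 2 * laplacian(phi, dx) - (phi ** 3 - phi))
    return phi + dt * dphi
```

Because each cell reads only its four neighbours, the update maps onto one CUDA thread per cell with shared-memory tiling for the stencil reads.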

Nov, 22

### GPU-accelerated molecular modeling coming of age

Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over […]

Nov, 22

### Accelerating electrostatic surface potential calculation with multi-scale approximation on graphics processing units

Tools that compute and visualize biomolecular electrostatic surface potential have been used extensively for studying biomolecular function. However, determining the surface potential for large biomolecules on a typical desktop computer can take days or longer using currently available tools and methods. Two commonly used techniques to speed-up these types of electrostatic computations are approximations based […]
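The expensive baseline these approximations target is the direct Coulomb sum: every surface point accumulates the potential of every charge, an O(N·M) computation. A vectorized sketch of that kernel (arbitrary units, no dielectric or screening model; assumes no surface point coincides with a charge):

```python
import numpy as np

def surface_potential(points, charges, positions):
    """Direct-sum Coulomb potential at each surface point:
    the O(N*M) all-pairs kernel that multi-scale methods
    approximate for large biomolecules."""
    # Pairwise distances between M surface points and N charges.
    d = np.linalg.norm(points[:, None, :] - positions[None, :, :], axis=2)
    return (charges[None, :] / d).sum(axis=1)
```

Multi-scale approximation replaces the far-field part of this sum with coarse aggregates, while the GPU handles the remaining near-field pairs in parallel.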

Nov, 22

### An Efficient Implementation of GPU Virtualization in High Performance Clusters

Current high performance clusters are equipped with high bandwidth/low latency networks, many processors and nodes, very fast storage systems, etc. However, due to economic and/or power-related constraints, it is generally not feasible to provide an accelerating co-processor – such as a graphics processor (GPU) – per node. To overcome this, in this […]

Nov, 21

### Megapixel Topology Optimization on a Graphics Processing Unit

We show how the computational power and programmability of modern graphics processing units (GPUs) can be used to efficiently solve large-scale pixel-based material distribution problems using a gradient-based optimality criterion method. To illustrate the principle, a so-called topology optimization problem that results in a constrained nonlinear programming problem with over 4 million decision variables is […]
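The optimality criterion method updates every pixel density independently from its sensitivity, with only a scalar Lagrange multiplier coupling them through the volume constraint, which is why it parallelizes so well. A standard textbook-style sketch of the SIMP/OC density update (generic, not the paper's GPU kernel; `move` is the usual move limit):

```python
import numpy as np

def oc_update(x, dc, volfrac, move=0.2):
    """Optimality-criterion density update for SIMP-style topology
    optimization: scale each density by the ratio of its (negative)
    compliance sensitivity dc to a Lagrange multiplier, found by
    bisection on the volume constraint mean(x) = volfrac."""
    lo, hi = 1e-9, 1e9
    while (hi - lo) / (hi + lo) > 1e-6:
        lmid = 0.5 * (lo + hi)
        x_new = np.clip(x * np.sqrt(-dc / lmid),
                        np.maximum(0.001, x - move),   # lower bound + move limit
                        np.minimum(1.0, x + move))     # upper bound + move limit
        if x_new.mean() > volfrac:
            lo = lmid     # too much material: raise the multiplier
        else:
            hi = lmid
    return x_new
```

The per-element clip-and-scale is embarrassingly parallel; only the mean for the bisection needs a reduction, so a 4-million-variable problem fits the GPU model well.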

Nov, 21

### Breaking ECC2K-130

Elliptic-curve cryptography is becoming the standard public-key primitive not only for mobile devices but also for high-security applications. Advantages are the higher cryptographic strength per bit in comparison with RSA and the higher speed in implementations. To improve understanding of the exact strength of the elliptic-curve discrete-logarithm problem, Certicom has published a series of challenges. […]

Nov, 21

### A high performance agent based modelling framework on graphics card hardware with CUDA

We present an efficient implementation of a high performance parallel framework for Agent Based Modelling (ABM), exploiting the parallel architecture of the Graphics Processing Unit (GPU). It provides a mapping between formal agent specifications, with C-based scripting, and optimised NVIDIA Compute Unified Device Architecture (CUDA) code. The mapping of agent data structures and agent […]
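The key data-layout idea behind such mappings is Structure-of-Arrays: each agent attribute is stored contiguously so that one-thread-per-agent kernels read coalesced memory. A CPU analogue of that layout and update (the attribute names and `move` function are illustrative, not the framework's API):

```python
import numpy as np

# Structure-of-Arrays agent state: each attribute is its own contiguous
# array, the layout that gives coalesced loads on the GPU.
n = 1024
agents = {
    "x":  np.zeros(n, dtype=np.float32),
    "y":  np.zeros(n, dtype=np.float32),
    "vx": np.ones(n,  dtype=np.float32),
    "vy": np.zeros(n, dtype=np.float32),
}

def move(agents, dt):
    """Data-parallel update: one whole-array operation per attribute,
    the CPU analogue of a one-thread-per-agent CUDA kernel."""
    agents["x"] += agents["vx"] * dt
    agents["y"] += agents["vy"] * dt

move(agents, 0.5)
```

An Array-of-Structures layout (one record per agent) would scatter each attribute across memory and serialize the GPU's memory transactions.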

Nov, 21

### Large-scale deep unsupervised learning using graphics processors

The promise of unsupervised learning methods lies in their potential to use vast amounts of unlabeled data to learn complex, highly nonlinear models with millions of free parameters. We consider two well-known unsupervised learning models, deep belief networks (DBNs) and sparse coding, that have recently been applied to a flurry of machine learning applications (Hinton […]
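Sparse coding inference, one of the two models considered, reduces to many independent per-sample optimizations dominated by dense matrix products, which is what makes it GPU-friendly. A generic ISTA (iterative soft-thresholding) sketch of that inference step (a standard solver for the lasso-style objective, not necessarily the paper's algorithm):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=200):
    """Infer codes S minimizing 0.5*||X - D S||^2 + lam*||S||_1 by ISTA.
    D: (features, atoms) dictionary, X: (features, samples)."""
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
    S = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ S - X)       # gradient of the quadratic term
        S = soft_threshold(S - grad / L, lam / L)
    return S
```

Every iteration is two dense matrix multiplies plus an elementwise threshold over all samples at once, so the whole batch of inferences maps onto a handful of large GPU kernels.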