Posts
Aug, 4
Porting marine ecosystem model spin-up using transport matrices to GPUs
We have ported an implementation of the spin-up for marine ecosystem models based on the "Transport Matrix Method" to graphics processing units (GPUs). The original implementation was designed for distributed-memory architectures and uses the PETSc library that is based on the "Message Passing Interface (MPI)" standard. The spin-up computes a steady seasonal cycle of the […]
Aug, 3
Solving very large instances of the scheduling of independent tasks problem on the GPU
In this paper, we present two new parallel algorithms to solve large instances of the scheduling of independent tasks problem. First, we describe a parallel version of the Min-min heuristic. Second, we present GraphCell, an advanced parallel cellular genetic algorithm (CGA) for the GPU. Two new generic recombination operators that take advantage of the massive […]
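For readers unfamiliar with the Min-min heuristic named in this abstract, here is a minimal sequential sketch of the textbook version. It is not the paper's parallel GPU implementation. The function name `min_min` and the input layout `etc[t][m]` (expected time to compute task `t` on machine `m`) are assumptions made for illustration.

```python
def min_min(etc):
    """Textbook sequential Min-min scheduling (illustrative sketch).

    etc[t][m] is the assumed expected time to compute task t on machine m.
    Returns (assignment, makespan), where assignment[t] is the machine
    chosen for task t.
    """
    n_tasks = len(etc)
    n_machines = len(etc[0]) if etc else 0
    ready = [0.0] * n_machines          # time at which each machine becomes free
    assignment = [None] * n_tasks
    unassigned = set(range(n_tasks))

    while unassigned:
        best = None                     # (completion_time, task, machine)
        # Find the (task, machine) pair with the smallest completion time.
        # This equals picking, among all tasks, the one whose minimum
        # completion time over machines is smallest.
        for t in sorted(unassigned):
            for m in range(n_machines):
                completion = ready[m] + etc[t][m]
                if best is None or completion < best[0]:
                    best = (completion, t, m)
        completion, t, m = best
        assignment[t] = m
        ready[m] = completion           # the machine is busy until this task ends
        unassigned.remove(t)

    makespan = max(ready) if ready else 0.0
    return assignment, makespan
```

Each round scans every remaining task against every machine, so the sequential cost grows as the square of the task count times the machine count. The inner scan is independent across pairs, which is one plausible reason the heuristic lends itself to a parallel version.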
Aug, 3
Power Management for GPU-CPU Heterogeneous Systems
In recent years, GPU-CPU heterogeneous architectures have been increasingly adopted in high performance computing because they provide high computational throughput. However, current research focuses mainly on the performance aspects of GPU-CPU architectures, while improving the energy efficiency of such systems receives much less attention. There are few existing efforts that try to […]
Aug, 3
GPU-to-GPU and Host-to-Host multipattern string matching on a GPU
We develop GPU adaptations of the Aho-Corasick and multipattern Boyer-Moore string matching algorithms for the two cases GPU-to-GPU (input is initially in GPU memory and the output is left in GPU memory) and host-to-host (input and output are in the memory of the host CPU). For the GPU-to-GPU case, we consider several refinements to a […]
Aug, 3
Parallel Statistical Analysis of Analog Circuits by GPU-accelerated Graph-based Approach
In this paper, we propose a new parallel statistical analysis method for large analog circuits using a determinant decision diagram (DDD) based graph technique on GPU platforms. The DDD-based symbolic analysis technique enables exact symbolic analysis of very large analog circuits. But we show that DDD-based graph analysis is very amenable to massively threaded parallel […]
Aug, 3
Performance Analysis of GPU Accelerators with Realizable Utilization of Computational Density
With the rising number of application accelerators, developers are looking for ways to evaluate new and competing platforms quickly, fairly, and early in the development cycle. As high-performance computing (HPC) applications increase their demands on application acceleration platforms, graphics processing units (GPUs) provide a potential solution for many developers looking for increased performance. Device performance […]
Aug, 2
Parallelizing flow-accumulation calculations on graphics processing units – From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm
As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm […]
Aug, 2
Automated Tool to Generate Parallel CUDA code from a Serial C Code
With the introduction of GPGPUs, parallel programming has become simple and affordable. APIs such as NVIDIA’s CUDA have attracted many programmers to port their applications to GPGPUs. But writing CUDA code remains a challenging task. Moreover, the vast repositories of legacy serial C code, which are still in wide use in the industry, are […]
Aug, 2
C to Cellular Automata and Execution on CPU, GPU and FPGA
Over the last decades, Cellular Automata (CA) have become more and more present in solving general-purpose problems, but the main issue is how to map a problem to a Cellular Automata model. Special languages were developed for programming such models, but learning a new programming language is very time-consuming. Furthermore, software developers have to […]
Aug, 2
Accelerating Pathology Image Data Cross-Comparison on CPU-GPU Hybrid Systems
As an important application of spatial databases in pathology imaging analysis, cross-comparing the spatial boundaries of a huge number of segmented micro-anatomic objects demands extremely data- and compute-intensive operations, requiring high throughput at an affordable cost. However, the performance of spatial database systems has not been satisfactory since their implementations of spatial operations cannot fully […]
Aug, 2
A GPU-Computing Approach to Solar Stokes Profile Inversion
We present a new computational approach to the inversion of solar photospheric Stokes polarization profiles, under the Milne-Eddington model, for vector magnetography. Our code, named GENESIS (GENEtic Stokes Inversion Strategy), employs multi-threaded parallel-processing techniques to harness the computing power of graphics processing units (GPUs), along with algorithms designed to exploit the inherent parallelism of the […]
Aug, 1
Interference-driven resource management for GPU-based heterogeneous clusters
GPU-based clusters are increasingly being deployed in HPC environments to accelerate a variety of scientific applications. Despite their growing popularity, the GPU devices themselves are under-utilized even for many computationally-intensive jobs. This stems from the fact that the typical GPU usage model is one in which a host processor periodically offloads computationally intensive portions of […]