Posts
Aug, 9
Performance Evaluation of Feature Extraction Algorithm on GPGPU
Nvidia’s GPGPU-based Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on GPUs. It provides several key abstractions: a hierarchy of thread blocks, shared memory, and barrier synchronization. This model has proven quite successful at programming multithreaded many-core GPUs and scales transparently to hundreds of cores: many industry […]
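The three abstractions named in this excerpt can be illustrated without a GPU. The following is a minimal sketch in plain Python, not CUDA code and not the paper's feature extraction algorithm: OS threads stand in for the threads of one block, a Python list stands in for shared memory, and `threading.Barrier` stands in for CUDA's `__syncthreads()`. The names `BLOCK_SIZE`, `block_sum`, and `grid_sum` are hypothetical and chosen only for this illustration.

```python
import threading

BLOCK_SIZE = 8  # hypothetical number of threads per block (a power of two)


def block_sum(values):
    """Sum exactly BLOCK_SIZE values with a tree reduction inside one 'block'."""
    shared = list(values)                    # stands in for per-block shared memory
    barrier = threading.Barrier(BLOCK_SIZE)  # stands in for __syncthreads()

    def worker(tid):
        stride = BLOCK_SIZE // 2
        while stride > 0:
            if tid < stride:
                # Each active thread folds the upper half onto the lower half.
                shared[tid] += shared[tid + stride]
            # Every thread waits here, so a step is complete before the next begins.
            barrier.wait()
            stride //= 2

    threads = [threading.Thread(target=worker, args=(tid,))
               for tid in range(BLOCK_SIZE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared[0]


def grid_sum(data):
    """Split data into blocks, reduce each block, then combine the partial sums."""
    total = 0
    for start in range(0, len(data), BLOCK_SIZE):
        block = list(data[start:start + BLOCK_SIZE])
        block += [0] * (BLOCK_SIZE - len(block))  # pad the last block with zeros
        total += block_sum(block)
    return total


result = grid_sum(list(range(20)))
```

In this sketch each block works only on its own shared list, which is what lets the same code handle any number of blocks; in CUDA the blocks would additionally run in parallel across the GPU's cores.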
Aug, 9
Cache Miss Analysis for GPU Programs Based on Stack Distance Profile
Using the graphics processing unit (GPU) to accelerate general-purpose computation has attracted much attention from both academia and industry because of the GPU’s powerful computing capacity. Thus, optimization of GPU programs has become a popular research direction. To support general-purpose computing more efficiently, the GPU has integrated the general data […]
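The title refers to a stack distance profile. As a minimal sketch of what the term commonly means, and not the GPU-specific method of the paper (which the excerpt does not show), the Python below computes the LRU stack distance of each access in a trace: the number of distinct addresses touched since the previous access to the same address. The trace and the function names `stack_distances` and `miss_count` are hypothetical.

```python
from collections import Counter


def stack_distances(trace):
    """Return the LRU stack distance of every access in `trace`.

    First-time accesses have no previous reference and get None.
    """
    stack = []       # LRU stack; the most recently used address is last
    distances = []
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            # Addresses above `addr` in the stack were touched since its last use.
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(None)  # cold (compulsory) miss
        stack.append(addr)
    return distances


def miss_count(distances, capacity):
    """Misses of a fully associative LRU cache that holds `capacity` lines."""
    return sum(1 for d in distances if d is None or d >= capacity)


trace = ["a", "b", "c", "a", "b", "d", "a"]  # hypothetical address trace
dists = stack_distances(trace)
profile = Counter(dists)                     # histogram of stack distances
```

The appeal of the profile is that one pass over the trace answers the miss question for every cache size at once: under the fully associative LRU assumption made here, an access misses exactly when it is a first-time access or its distance is at least the capacity.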
Aug, 9
Matrix Multiplication on GPUs with On-Line Fault Tolerance
Commercial graphics processing units (GPUs) have proven attractive and inexpensive for high-performance scientific applications. However, recent research through Folding@home demonstrates that two-thirds of the GPUs tested on Folding@home exhibit a detectable, pattern-sensitive rate of memory soft errors for GPGPU. Fault tolerance has been viewed as critical to the effective use of these GPUs. In this […]
Aug, 9
Optimization of parallel Genetic Algorithms for nVidia GPUs
Led by General Purpose computing over Graphical Processing Units (GPGPUs), the parallel computing area is witnessing a rapid change in dominant parallel systems. A major hurdle in this switch is the Single Instruction Multiple Thread (SIMT) architecture of GPUs, which is usually not suitable for the design of legacy parallel algorithms. Genetic Algorithms (GAs) is […]
Aug, 9
In-process optical characterization method for sub-100-nm nanostructures
Optical measurements based on laser light scattering by nanostructures provide fast and contactless measurement of the surface of nanostructures for defects. In this paper, a novel in-process measurement method based on coherent laser light scattering by sub-100-nm structures is presented. It is shown that nanostructure defects can be identified by their unique scattering pattern. This […]
Aug, 8
High-Performance Diagnostic Fault Simulation on GPUs
In this paper, we present an efficient diagnostic fault simulator based on a state-of-the-art graphics processing unit (GPU). Diagnostic fault simulation plays an important role in identifying and locating the causes of circuit failures. However, today’s complex VLSI circuits pose ever higher computational demands for such simulators. Our GPU-based diagnostic fault simulator (GDSim) is […]
Aug, 8
Performance Comparison with OpenMP Parallelization for Multi-core Systems
Today, multi-core processors occupy a growing share of the market, and programmers must face the impact of the multi-core revolution. Semiconductor scaling limits and the associated power and thermal challenges constrain performance growth for single-core microprocessors. This has led many microprocessor vendors to turn instead to multi-core chip organizations. […]
Aug, 8
GPU Computing in EGI Environment Using a Cloud Approach
Recently, GPU computing, namely the possibility of using the vector processors of graphics cards as general-purpose computational units in High Performance Computing environments, has generated considerable interest in the scientific community. Some communities in the European Grid Infrastructure (EGI) are reshaping their applications to exploit this new programming paradigm. Each EGI community, called Virtual Organization […]
Aug, 8
AES finalists implementation for GPU and multi-core CPU based on OpenCL
Thanks to OpenCL (Open Computing Language), applications can be easily ported among different GPUs, multi-core CPUs, and other processors. In this paper, we present an implementation of the AES finalists (Rijndael, Serpent, Twofish) in XTS mode, based on OpenCL. Benchmark testing is performed on 4 mainstream GPUs and multi-core CPUs. The results are also compared with […]
Aug, 8
The distributed diagonal force decomposition method for parallelizing molecular dynamics simulations
Parallelization is an effective way to reduce the computational time needed for molecular dynamics simulations. We describe a new parallelization method, the distributed-diagonal force decomposition method, with which we extend and improve the existing force decomposition methods. Our new method requires less data communication during molecular dynamics simulations than replicated data and current force decomposition […]
Aug, 8
DC Power Flow Based Contingency Analysis Using Graphics Processing Units (thesis)
This thesis explores the possibility of mapping power flow algorithms onto a graphics processor. In particular, we demonstrate the implementation of DC power flow based contingency analysis on a graphics processing unit (GPU). GPUs are SIMD processors with a highly streamlined architecture to support rendering of graphic images on the computer screen. However, in the recent […]
Aug, 8
DC Power Flow Based Contingency Analysis Using Graphics Processing Units
Graphics processing units (GPUs) are single-instruction, multiple-data processors that have become an integral part of the modern high-end video cards installed in general-purpose PCs. This paper investigates the parallel implementation of DC power flow based contingency analysis on graphics processing units. Results for the IEEE standard test systems show a speed-up of […]