Posts
Jan, 24
Multi-GPU Implementation for Iterative MR Image Reconstruction with Field Correction
Many advanced MRI acquisition and reconstruction methods see limited application because of their high computational cost. For instance, iterative reconstruction algorithms (e.g. for non-Cartesian k-space trajectories or magnetic field inhomogeneity compensation) can improve image quality but suffer from low reconstruction speed. General-purpose computing on graphics processing units (GPUs) has demonstrated significant performance speedups and […]
Jan, 24
Accelerating iterative field-compensated MR image reconstruction on GPUs
We propose a fast implementation for iterative MR image reconstruction using graphics processing units (GPUs). In MRI, iterative reconstruction with conjugate gradient algorithms allows for accurate modeling of the physics of the imaging system. Specifically, methods have been reported to compensate for the magnetic field inhomogeneity induced by the susceptibility differences near the air/tissue interface in […]
Jan, 24
Data Layout Transformation for Structured-Grid Codes on GPU
We present data layout transformation as an effective performance optimization for memory-bound structured-grid applications on GPUs. Structured-grid applications are a class of applications that compute grid cell values on a regular 2D, 3D or higher-dimensional grid. Each output point is computed as a function of itself and its nearest neighbors. Stencil code […]
Jan, 24
Program Optimization Strategies for Data-Parallel Many-Core Processors
Program optimization for highly parallel systems has historically been considered an art, with experts doing much of the performance tuning by hand. With the introduction of inexpensive, single-chip, massively parallel platforms, more developers will be creating highly data-parallel applications for these platforms while lacking the substantial experience and knowledge needed to maximize application performance. In […]
Jan, 24
Efficient Parallel Scan Algorithms for GPUs
Scan and segmented scan algorithms are crucial building blocks for a great many data-parallel algorithms. Segmented scan and related primitives also provide the necessary support for the flattening transform, which allows for nested data-parallel programs to be compiled into flat data-parallel languages. In this paper, we describe the design of efficient scan and segmented scan […]
Jan, 24
Efficient Sparse Matrix-Vector Multiplication on CUDA
The massive parallelism of graphics processing units (GPUs) offers tremendous performance in many high-performance computing applications. While dense linear algebra readily maps to such platforms, harnessing this potential for sparse matrix computations presents additional challenges. Given its role in iterative methods for solving sparse linear systems and eigenvalue problems, sparse matrix-vector multiplication (SpMV) is of […]
Jan, 23
Parallel Genetic Algorithms on Programmable Graphics Hardware
Parallel genetic algorithms are usually implemented on parallel machines or distributed systems. This paper describes how fine-grained parallel genetic algorithms can be mapped to the programmable graphics hardware found in commodity PCs. Our approach stores chromosomes and their fitness values in texture memory on the graphics card. Both fitness evaluation and genetic operations are implemented entirely with […]
Jan, 23
Parallel Evolutionary Algorithms on Consumer-Level Graphics Processing Unit
Evolutionary Algorithms (EAs) are effective and robust methods for solving many practical problems such as feature selection, electrical circuit synthesis, and data mining. However, they may execute for a long time on some difficult problems, because several fitness evaluations must be performed. A promising approach to overcome this limitation is to parallelize these algorithms. In […]
Jan, 23
Parallel Hybrid Genetic Algorithms on Consumer-Level Graphics Hardware
In this paper, we report a parallel hybrid genetic algorithm (HGA) on consumer-level graphics cards. HGA extends the classical genetic algorithm by incorporating the Cauchy mutation operator from evolutionary programming. In our parallel HGA, all steps except the random number generation procedure are performed on the graphics processing unit (GPU) and thus our parallel HGA can […]
Jan, 23
Cellular Genetic Algorithms and Local Search for 3-SAT Problem on Graphic Hardware
As a well-known NP-hard problem, the SAT problem is widely discussed in the computer science community. In this paper, two common algorithms for SAT problems, greedy local search and a genetic algorithm, are implemented on graphics hardware. After a brief description of the basic algorithm, we give our modification of the algorithm for fitting […]
Jan, 23
Evolutionary Computing on Consumer-Level Graphics Hardware
We propose implementing a parallel EA on consumer graphics cards, which we can find in many PCs. This lets more people use our parallel algorithm to solve large-scale, real-world problems such as data mining. Parallel evolutionary algorithms run on consumer-grade graphics hardware achieve better execution times than ordinary evolutionary algorithms and offer greater accessibility than […]
Jan, 23
Fast Genetic Programming and Artificial Developmental Systems on GPUs
In this paper we demonstrate the use of the graphics processing unit (GPU) to accelerate evolutionary computation applications, in particular genetic programming approaches. We show that it is possible to get speed increases of several hundred times over a typical CPU implementation, catapulting GPU processing for these applications into the realm of HPC. This increase […]