Posts
Nov, 5
Power Flow Analysis on CUDA-based GPU
This major qualifying project investigates the algorithm and performance of power flow analysis on a CUDA-based Graphics Processing Unit. The completed work includes the design, implementation, and testing of the power flow solver. Analysis shows that the parallel algorithm runs several times faster than the sequential algorithm.
Nov, 5
Real-time Flame Rendering with GPU and CUDA
This paper proposes a grid-free method of flame simulation based on a Lagrange process and chemical composition, which overcomes the problems associated with grids. The turbulent movement of the flame is described by a Lagrange process, and chemical composition is added to the simulation, which increases the realism of the flame. For real-time applications, this […]
Nov, 4
Accelerating a TV based JPEG decompression algorithm with Cuda
In previous work, we developed a mathematical model for artifact-free decompression of JPEG images. There, the problem of finding an artifact-free decompression for a given JPEG compressed image is related to a convex minimization problem. We use a primal-dual algorithm to solve this problem, for which we have developed a Matlab and C++ […]
Nov, 4
A CUDA Implementation of Independent Component Analysis in the Time-Frequency Domain
The blind separation of convolutive mixtures requires a huge amount of processing power. In this paper we propose a massively parallel implementation of Independent Component Analysis in the time-frequency domain, using the processing power of current graphics adapters within the CUDA framework. The commonly used approach for solving the separation task is the […]
Nov, 4
Parallelization of maximum likelihood fits with OpenMP and CUDA
Data analyses based on maximum likelihood fits are commonly used in the high energy physics community for fitting statistical models to data samples. This technique requires the numerical minimization of the negative log-likelihood function. MINUIT is the most common package used for this purpose in the high energy physics community. The main algorithm in this […]
Nov, 4
Combined acoustic and optical trapping
Combining several methods for contact-free micro-manipulation of small particles such as cells or micro-organisms provides the advantages of each method in a single setup. Optical tweezers, which employ focused laser beams, offer very precise and selective handling of single particles. On the other hand, acoustic trapping with wavelengths of about 1 mm allows the […]
Nov, 4
PEPSC: A Power-Efficient Processor for Scientific Computing
The rapid advancements in the computational capabilities of the graphics processing unit (GPU) as well as the deployment of general programming models for these devices have made the vision of a desktop supercomputer a reality. It is now possible to assemble a system that provides several TFLOPs of performance on scientific applications for the cost […]
Nov, 4
Gyrofluid Modeling of Turbulent, Kinetic Physics
Gyrofluid models of plasma turbulence combine the advantages of fluid models, such as lower dimensionality and well-developed intuition, with those of gyrokinetic models, such as finite Larmor radius (FLR) effects. This allows gyrofluid models to be more tractable computationally while still capturing much of the physics related to the FLR of the particles. We […]
Nov, 4
Semi-Global Matching - Motivation, Developments and Applications
Since its original publication, the Semi-Global Matching (SGM) technique has been re-implemented by many researchers and companies. The method offers a very good trade-off between runtime and accuracy, especially at object borders and fine structures. It is also robust against radiometric differences and not sensitive to the choice of parameters. Therefore, it is well […]
Nov, 4
Inter-cluster communication on clustered SIMD architectures
This work envisions that in the near future, GPU-like architectures will find their way to embedded systems. Accompanied by a small RISC control core, they will not merely be a hardware accelerator, but the heart of the system itself. Taking a state-of-the-art GPU, a baseline architecture is constructed with the embedded context in mind. Next, […]
Nov, 4
VASP on a GPU: application to exact-exchange calculations of the stability of elemental boron
General-purpose graphical processing units (GPUs) offer high processing speeds for certain classes of highly parallelizable computations, such as matrix operations and Fourier transforms, that lie at the heart of first-principles electronic structure calculations. Inclusion of exact exchange increases the cost of density functional theory by orders of magnitude, motivating the use of GPUs. Porting the […]
Nov, 4
Computing Optimal Cycle Mean in Parallel on CUDA
Computation of optimal cycle mean in a directed weighted graph has many applications in program analysis, performance verification in particular. In this paper we propose a data-parallel algorithmic solution to the problem and show how the computation of optimal cycle mean can be efficiently accelerated by means of CUDA technology. We show how the problem […]