Posts

Apr, 6

Optimizing Krylov Subspace Solvers on Graphics Processing Units

Krylov subspace solvers are often the method of choice when solving sparse linear systems iteratively. At the same time, hardware accelerators such as graphics processing units (GPUs) continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a […]
Apr, 6

A Low-Power Hybrid CPU-GPU Sort

This thesis analyses the energy efficiency of a low-power CPU-GPU hybrid architecture. We evaluate the NVIDIA Ion architecture, which couples an Intel Atom low-power processor with an integrated GPU that has an order of magnitude fewer processors than traditional discrete GPUs. We attempt to create a system that balances computation and I/O capabilities […]
Apr, 6

High Performance Computing for Large Graphs of Internet Applications using GPU

The high-speed CPU-based routers currently in use cannot handle the massive data volumes required for real-time multimedia communication. Graphics processing units (GPUs) offer an attractive alternative because of the high computational power of their parallel execution units. This paper presents an implementation of Dijkstra’s link-state IP routing algorithm on a GPU. […]
Apr, 6

GPU Accelerated Fractal Image Compression for Medical Imaging in Parallel Computing Platform

In this paper, we implement both sequential and parallel versions of fractal image compression algorithms, using the CUDA (Compute Unified Device Architecture) programming model to parallelize the program on a graphics processing unit. The target is medical images, which are highly self-similar. There are several improvements in the implementation of the algorithm as well. […]
Apr, 6

Parallel Support Vector Machines in Practice

In this paper, we evaluate the performance of various parallel optimization methods for Kernel Support Vector Machines on multicore CPUs and GPUs. In particular, we provide the first comparison of algorithms with explicit and implicit parallelization. Most existing parallel implementations for multi-core or GPU architectures are based on explicit parallelization of Sequential Minimal Optimization (SMO) […]
Apr, 4

CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

We present sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. We deal with the classical spin models: the Ising model, the q-state Potts model, and the classical XY model. As for the lattice, both the 2D (square) lattice and the 3D (simple cubic) lattice are treated. We already reported […]
Apr, 4

An efficient GPU acceptance-rejection algorithm for the selection of the next reaction to occur for Stochastic Simulation Algorithms

MOTIVATION: The Stochastic Simulation Algorithm (SSA) has become widespread in the field of systems biology. This approach needs many realizations to establish statistical results on the system under study. It is computationally very demanding, and with the advent of large models this burden is increasing. Hence parallel implementations of SSA are needed to address these […]
Apr, 4

Irradiation Instability at the Inner Edges of Accretion Disks

An instability can potentially operate in highly irradiated disks where the disk sharply transitions from being radially transparent to opaque (the ‘transition region’). Such conditions may exist at the inner edges of transitional disks around T Tauri stars and accretion disks around AGNs. We derive the criterion for this instability, which we term the ‘irradiation […]
Apr, 4

Anomalous metastability in a temperature-driven transition

Langer theory of metastability provides a description of the lifetime and properties of the metastable phase of the Ising model field-driven transition, describing the magnetic field-driven transition in ferromagnets and the chemical potential-driven transition of fluids. An immediate further step is to apply it to the study of a transition driven by the temperature, as […]
Apr, 4

Petascale computations for Large-scale Atomic and Molecular collisions

Petaflop architectures are currently being utilized efficiently to perform large-scale computations in atomic, molecular and optical collisions. We solve the Schrödinger or Dirac equation for the appropriate collision problem using the R-matrix or R-matrix with pseudo-states approach. We briefly outline the parallel methodology used and implemented for the current suite of Breit-Pauli and DARC […]
Apr, 4

2014 5th International Conference on Computer and Computational Intelligence, ICCCI 2014

2014-09-01 Selected submissions will be recommended for publication in one of the journals below: Journal of Computers (JCP, ISSN: 1796-203X) – EI Compendex; SCOPUS; Journal of Software (JSW, ISSN: 1796-217X) – EI Compendex; SCOPUS; International Journal of Machine Learning and Computing (IJMLC, ISSN: 2010-3700; DOI: 10.7763/IJMLC) – Engineering & Technology Digital Library, Google Scholar, Crossref, ProQuest, Electronic […]
Apr, 3

Experiments with Massively Parallel Matrix Multiplication

This paper presents initial experiments in implementing two notable matrix multiplication algorithms – the DNS algorithm and Cannon’s algorithm – using NVIDIA’s general-purpose graphics processing units (GPGPUs) and CUDA development platform. We demonstrate that these implementations are comparable with traditional methods in terms of computational expense and may scale better than traditional techniques.

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
