Posts
Mar, 3
The sparse matrix vector product on GPUs
The sparse matrix vector product (SpMV) is a paramount operation in engineering and scientific computing and, hence, has long been a subject of intense research. The irregular computations involved in SpMV make its optimization challenging. Therefore, enormous effort has been devoted to devising data formats to store the sparse matrix with the ultimate aim […]
Mar, 3
Unified – A Sharp Turn in the Latest Era of Graphic Processors
The need for high performance and realism has grown considerably over the last few decades, especially in gaming, 3D graphics and computationally demanding applications. This has compelled GPU vendors to put their best effort into improving ILP (Instruction Level Parallelism). As a result, the GPU has entered a […]
Mar, 3
Building Correlators with Many-Core Hardware
Radio telescopes typically consist of multiple receivers whose signals are cross-correlated to filter out noise. A recent trend is to correlate in software instead of custom-built hardware, taking advantage of the flexibility that software solutions offer. Examples include e-VLBI and LOFAR. However, the data rates are usually high and the processing requirements challenging. Many-core processors […]
Mar, 3
RankBoost Acceleration on both NVIDIA CUDA and ATI Stream Platforms
NVIDIA CUDA and ATI Stream are the two major general-purpose GPU (GPGPU) computing technologies. We implemented RankBoost, a web relevance ranking algorithm, on both NVIDIA CUDA and ATI Stream platforms to accelerate the algorithm and illustrate the differences between these two technologies. The results show that the performance of GPU programs is highly dependent on the […]
Mar, 3
Parallel Cycle Based Logic Simulation Using Graphics Processing Units
Graphics Processing Units (GPUs) are gaining popularity for parallelization of general purpose applications. GPUs are massively parallel processors with huge performance in a small and readily available package. At the same time, the emergence of general purpose programming environments for GPUs such as CUDA shortens the learning curve of GPU programming. We present a GPU-based […]
Mar, 3
Speeding Up Cycle Based Logic Simulation Using Graphics Processing Units
Verification has grown to dominate the cost of electronic system design, consuming about 60% of design effort. Among several verification techniques, logic simulation remains the major verification technique. Speeding up logic simulation results in great savings and shorter time-to-market. We parallelize logic simulation using Graphics Processing Units (GPUs). In the past, GPUs were special-purpose application […]
Mar, 3
Real-time dynamic tone-mapping operator on GPU
This article presents the parallel implementation on a GPU of a real-time dynamic tone-mapping operator. The operator we describe in this article is generic and may be used by any application. However, the goal of our work is to integrate this operator into the graphic rendering process of a car driving simulator; thus, we studied […]
Mar, 3
Singular value decomposition for collaborative filtering on a GPU
Collaborative filtering predicts customers’ unknown preferences from known preferences. Computing a collaborative filter requires a singular value decomposition (SVD) to reduce the size of a large-scale matrix, which lightens the computational burden of the next phase. In this application, SVD means a roughly approximated factorization of […]
Mar, 2
7th International Workshop on OpenMP, IWOMP 2011
The International Workshop on OpenMP (IWOMP) is an annual workshop dedicated to the promotion and advancement of all aspects of parallel programming with OpenMP. It is the premier forum to present and discuss issues, trends, recent research ideas and results related to parallel programming with OpenMP. The international workshop affords an opportunity for OpenMP users […]
Mar, 2
FluoroSim: A Visual Problem-Solving Environment for Fluorescence Microscopy
Fluorescence microscopy provides a powerful method for localization of structures in biological specimens. However, aspects of the image formation process such as noise and blur from the microscope’s point-spread function combine to produce an unintuitive image transformation on the true structure of the fluorescing molecules in the specimen, hindering qualitative and quantitative analysis of even […]
Mar, 2
ECC2K-130 on NVIDIA GPUs
A major cryptanalytic computation is currently underway on multiple platforms, including standard CPUs, FPGAs, PlayStations and Graphics Processing Units (GPUs), to break the Certicom ECC2K-130 challenge. This challenge is to compute an elliptic-curve discrete logarithm on a Koblitz curve over F_{2^131}. Optimizations have reduced the cost of the computation to approximately 2^77 bit operations in […]
Mar, 2
Accelerating Statistical Static Timing Analysis Using Graphics Processing Units
In this paper, we explore the implementation of Monte Carlo based statistical static timing analysis (SSTA) on a graphics processing unit (GPU). SSTA via Monte Carlo simulations is a computationally expensive but important step required to achieve design timing closure. It provides an accurate estimate of delay variations and their impact on design yield. The […]