Posts
Feb, 5
Accelerating Outlier Detection with Uncertain Data using Graphics Processors
Outlier detection (also known as anomaly detection) is a common data mining task in which data points that lie outside expected patterns in a given dataset are identified. This is useful in areas such as fault detection, intrusion detection, and pre-processing before further analysis. There are many approaches already in use for outlier detection, […]
Feb, 5
Efficient Computation of SOM for Outage Database
This paper describes the use of the Self-Organizing Map (SOM) method for the analysis of power outage data. SOM, already used in many fields, is based on the Kohonen self-organizing neural network and is known to capture underlying concepts. We apply this method to a unified database of power outages to […]
Feb, 5
Performance Evaluation of Particle Swarm Optimization Algorithms on GPU Using CUDA
Particle Swarm Optimization (PSO) is a simple but powerful optimization algorithm based on the social behavior of particles. PSO has become popular due to its simplicity and its effectiveness in a wide range of applications at low computational cost. The main objective of this paper is to implement a parallel asynchronous version and synchronous versions […]
Jan, 31
Raytracing Dynamic Scenes on GPU
Raytracing dynamic scenes at interactive to real-time rates has received a lot of attention recently. In this dissertation, we present a few strategies for high-performance ray tracing on an off-the-shelf commodity Graphics Processing Unit (GPU) traditionally used for accelerating gaming and other graphics applications. We utilize the grid data structure for spatially arranging the […]
Jan, 31
Decompilation of LLVM IR
Recently, in many important domains, high-level languages have become the code representations with the widest platform support, surpassing any low-level language in their area with respect to completeness and importance as an exchange format (e.g., OpenCL for data-parallel computing, GLSL/HLSL for shader programs, JavaScript for the web). The code representations of many actively developed compiler frameworks [JVM, LLVM, FIRM] are […]
Jan, 31
The Virtual OpenCL (VCL) Cluster Platform
Heterogeneous computing systems can dramatically increase the performance of parallel applications on clusters. Currently, applications that utilize GPU and APU devices run their device-specific code only on devices of the same computer where the application runs. This paper presents the Virtual OpenCL (VCL) cluster platform that can run unmodified OpenCL applications transparently on clusters with […]
Jan, 31
Graphical processing unit implementation of an integrated shape-based active contour: Application to digital pathology
Commodity graphics hardware has become a cost-effective parallel platform to solve many general computational problems. In medical imaging, and more so in digital pathology, segmentation of multiple structures on high-resolution images is often a complex and computationally expensive task. Shape-based level set segmentation has recently emerged as a natural solution to segmenting overlapping and occluded […]
Jan, 31
An OpenCL implementation for the solution of TDSE on GPU and CPU architectures
Open Computing Language (OpenCL) is a parallel processing language that is ideally suited for running parallel algorithms on Graphical Processing Units (GPUs). In the present work, we report the development of a generic parallel single-GPU code for the numerical solution of a system of first-order ordinary differential equations (ODEs) based on the OpenCL model. We […]
Jan, 30
Algorithmic Contributions to the Theory of Regular Chains
Regular chains, introduced about twenty years ago, have emerged as one of the major tools for solving polynomial systems symbolically. In this thesis, we focus on different algorithmic aspects of the theory of regular chains, from theoretical questions to high-performance implementation issues. The inclusion test for saturated ideals is a fundamental problem in this theory. […]
Jan, 30
Fast CT Image Processing using Parallelized Non-local Means
Reducing the radiation dose delivered to patients has been an important concern since the introduction of X-ray computed tomography (CT). However, low-dose CT images tend to be severely degraded by noise. This paper proposes using parallelized non-local means (PNM) under a computation framework for improving low-dose X-ray CT images. For the proposed PNM method, the […]
Jan, 30
Numerical Ocean Modeling and Simulation with CUDA
ROMS is software that models and simulates an ocean region using a finite difference grid and time stepping. ROMS simulations can take from hours to days to complete due to the compute-intensive nature of the software. As a result, the size and resolution of simulations are constrained by the performance limitations of modern computing hardware. […]
Jan, 30
On CUDA implementation of a multichannel room impulse response reshaping algorithm based on p-norm optimization
By using room impulse response shortening and shaping, it is possible to reduce the reverberation effects and therefore improve speech intelligibility. This may be achieved by a prefilter that modifies the overall impulse response to have a stronger attenuation. To achieve spatial robustness, multichannel approaches have been proposed. Unfortunately, these approaches suffer from a […]