Posts
Oct, 22
Neurokernel: An Open Source Platform for Emulating the Fruit Fly Brain
We have developed an open software platform called Neurokernel for collaborative development of comprehensive models of the brain of the fruit fly Drosophila melanogaster and their execution and testing on multiple Graphics Processing Units (GPUs). Neurokernel provides a programming model that capitalizes upon the structural organization of the fly brain into a fixed number of […]
Oct, 18
Implementation of a Power Efficient Synthetic Aperture Radar Back Projection Algorithm on FPGAs Using OpenCL
In this thesis, an implementation of a Synthetic Aperture Radar (SAR) back projection algorithm onto a Field-Programmable Gate Array (FPGA) device using Open Computing Language (OpenCL) is developed. SAR back projection is a method to form a high-resolution terrain image from radar data. SAR is used in many applications such as Geographic Information Systems (GIS), […]
Oct, 18
A Network Intrusion Detection System Framework based on Hadoop and GPGPU
In IT industry the business data grows exponentially, which results in concern to enhance the security system by implementing effective NIDS (Network Intrusion Detection System).The quick response to detecting intrusion an essential feature of any NIDS system, but due to the huge amount of data obtained from organizations which impacts the performance of NIDS. The […]
Oct, 18
Performance analysis and optimization of a CFD application
This thesis documents the analysis and optimization of a high-order finite difference computational fluid dynamics (CFD) application (PlasComCM). Performance bottlenecks were identified using performance tools and hardware counters. The performance analysis of PlasComCM showed that the quantity of memory accesses and the lack of vectorization inhibited optimal serial performance on a x86-based CPU. Optimizing techniques […]
Oct, 18
MetaFork: A Compilation Framework for Concurrency Models Targeting Hardware Accelerators and Its Application to the Generation of Parametric CUDA Kernels
In this paper, we present the accelerator model of MetaFork together with the software framework that allows automatic generation of CUDA code from annotated MetaFork programs. One of the key features of this CUDA code generator is that it supports the generation of CUDA kernel code where program parameters (like number of threads per block) […]
Oct, 18
Self-Adapting Parallel Framework for Long-Term Object Tracking
Object tracking is a crucial field in computer vision that has many uses in human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, etc. Many implementations are introduced in practice, and yet recent methods emphasize on tracking objects adaptively by learning the object’s perspectives and rediscovering it when it becomes untraceable, […]
Oct, 16
Sapporo2: A versatile direct N-body library
Astrophysical direct $N$-body methods have been one of the first production algorithms to be implemented using NVIDIA’s CUDA architecture. Now, almost seven years later, the GPU is the most used accelerator device in astronomy for simulating stellar systems. In this paper we present the implementation of the Sapporo2 $N$-body library, which allows researchers to use […]
Oct, 16
Multi-dimensional Functional Principal Component Analysis
Functional principal component analysis is one the most commonly employed approaches in functional/longitudinal data analysis and we extend it to conduct $d$-dimensional functional/longitudinal data analysis. The computational issues emerging in the extension are fully addressed with our proposed solutions. The local linear smoothing technique is employed to perform estimation because of its capabilities of performing […]
Oct, 16
Density-based parallel skin lesion border detection with webCL
BACKGROUND: Dermoscopy is a highly effective and noninvasive imaging technique used in diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate […]
Oct, 16
Comparison of Thread Execution Methods for GPU-oriented OpenCL Programs on Multicore Processors
With the broad deployment of multicore processors, there are increasing demands to port OpenCL programs written for GPUs onto the multicore processors. However, OpenCL programs written for GPUs cannot run efficiently on multicore processors since GPU-oriented OpenCL programs generally consist of a huge number of threads. This paper presents experimental comparisons of three thread execution […]
Oct, 16
A progressive mesh method for physical simulations using lattice Boltzmann method on single-node multi-gpu architectures
In this paper, a new progressive mesh algorithm is introduced in order to perform fast physical simulations by the use of a lattice Boltzmann method (LBM) on a single-node multi-GPU architecture. This algorithm is able to mesh automatically the simulation domain according to the propagation of fluids. This method can also be useful in order […]
Oct, 13
Accelerating Applications with Pattern-specific Optimizations on Accelerators and Coprocessors
Because of the bottleneck in the increase of clock frequency, multi-cores emerged as a way of improving the overall performance of CPUs. In the recent decade, many-cores begin to play a more and more important role in scientific computing. The highly cost-effective nature of many-cores makes them extremely suitable for data-intensive computations. Specifically, many-cores are […]