Posts
Jul, 27
Implementation of 2-D Discrete Cosine Transform Algorithm on GPU
The Discrete Cosine Transform (DCT) is a technique for frequency separation. Applied to an image, the DCT separates it into frequency components: a DC value and a range of coefficients from low to high frequency. The DCT is very useful in image compression. When high frequency values are […]
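To make the DC versus higher-frequency split concrete, here is a minimal sketch of a naive 2-D DCT-II in plain Python. It is not the GPU implementation from the post: the function name `dct_2d`, the orthonormal scaling and the 4x4 test block are all assumptions chosen for illustration.

```python
import math

def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of a square block (list of lists).

    Hypothetical helper for illustration only; runs in O(n^4) time.
    """
    n = len(block)

    def alpha(k):
        # Orthonormal scaling: the DC term (k == 0) gets a smaller weight.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

# A perfectly flat block has no variation, so all of its energy
# should end up in the DC coefficient at position [0][0].
flat = [[10.0] * 4 for _ in range(4)]
coeffs = dct_2d(flat)
```

For the flat block, `coeffs[0][0]` holds the whole signal and every other coefficient is zero up to rounding, which is the property that compression schemes exploit when they discard small high-frequency terms.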
Jul, 27
Fast Image Processing with Embedded Microprocessors
This thesis is intended as a starter guide to the basics of image processing techniques and their common use cases, while also taking advantage of the Graphics Processing Unit available in today’s embedded multimedia microprocessors found in netbooks, smartphones and tablets.
Jul, 27
GPU Parallel Algorithms for Reporting Movement Behaviour Patterns in Spatiotemporal Databases
Mobility is a key element of many processes and activities, and the understanding of movement is important in many areas of science and technology. With the recent advances in technologies for mobile devices, like GPS and mobile phones, we are able to generate data sets of people, animals, vehicles and other moving objects, normally available […]
Jul, 25
A unified sparse matrix data format for modern processors with wide SIMD units
Sparse matrix-vector multiplication (spMVM) is the most time-consuming kernel in many numerical algorithms and has been studied extensively on all modern processor and accelerator architectures. However, the optimal sparse matrix data storage format is highly hardware-specific, which could become an obstacle when using heterogeneous systems. Also, it is as yet unclear how the wide single […]
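As a point of reference for what the spMVM kernel computes, the following is a minimal sketch using the widely used compressed sparse row (CSR) layout. CSR is assumed here only as a familiar baseline; it is not the unified format the post proposes, and the function name `csr_matvec` and the 3x3 example matrix are made up for illustration.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the entries of row i
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect access into x is what makes spMVM memory-bound.
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

# Hypothetical example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Rows with different numbers of nonzeros lead to inner loops of uneven length, which is one reason a storage layout that suits one architecture can perform poorly on another.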
Jul, 25
Parallel birth and death process for cell nuclei extraction in histopathology images
Cell nuclei extraction from histopathology images is necessary for breast cancer grading, and has become one of the major problems in the domain of automatic image analysis. Stochastic marked point processes combined with birth and death processes are promising tools for such extraction, but they are extremely compute-intensive, especially on large images such as […]
Jul, 25
Characterization of Lossy SIW Resonators Based on Multilayer Perceptron Neural Networks on Graphics Processing Unit
In recent years, Artificial Neural Networks (ANNs) have been intensively employed to build smart models of microwave devices. In this paper, a characterization of lossy SIW resonators by means of Multilayer Perceptron Neural Networks (MLPNNs) on a Graphics Processing Unit (GPU) is presented. Once properly selected and trained, an MLPNN can evaluate the lossy SIW resonator’s […]
Jul, 25
Octree Light Propagation Volumes
This paper presents a new method for representing Light Propagation Volumes using an octree data structure, and for allowing light from regular point light sources to be injected into them. The resulting technique uses full octrees with the help of a separate data structure for representing the octree structure. The octree structure enables light propagation […]
Jul, 25
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without […]
Jul, 25
Efficient Rendering of Scenes with Dynamic Lighting Using a Photons Queue and Incremental Update Algorithm
Photon mapping is a popular extension to the classic ray tracing algorithm in the field of realistic image synthesis. Moreover, it benefits from the massively parallel computational power brought by recent developments in graphics processor hardware and programming models. However, rendering scenes with dynamic lights still greatly limits performance due to the reconstruction […]
Jul, 25
Modeling of Heterogeneous Architecture with GPU to Exascale System
For many years, the High-Performance Computing (HPC) community aimed at increasing performance regardless of energy consumption. However, energy is limiting the scalability of the next generation of supercomputers. Current HPC systems already consume huge amounts of power, on the order of a few megawatts (MW). Future HPC systems aim to achieve 10 to 100 […]
Jul, 25
Fast Exhaustive Search for Quadratic Systems in F2 on FPGAs – Extended Version
In 2010, Bouillaguet et al. proposed an efficient solver for polynomial systems over $\mathbb{F}_2$ that trades memory for speed. As a result, 48 quadratic equations in 48 variables can be solved on a graphics card (GPU) in 21 minutes. The research question that we would like to answer in this paper is how specifically designed […]
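For readers unfamiliar with the problem, the sketch below shows what exhaustive search over $\mathbb{F}_2$ means at its simplest: try every 0/1 assignment and keep those that zero every equation. This is a plain brute-force loop, not the optimized solver of Bouillaguet et al. nor the FPGA design; the polynomial encoding, the function names and the toy 3-variable system are assumptions made for illustration.

```python
from itertools import product

def eval_quadratic(quad, lin, const, x):
    """Evaluate one quadratic polynomial over F2 at the bit tuple x.

    quad  : list of (i, j) pairs, one per monomial x_i * x_j
    lin   : list of indices i, one per monomial x_i
    const : constant term (0 or 1)
    Addition in F2 is XOR, multiplication is AND.
    """
    acc = const
    for i, j in quad:
        acc ^= x[i] & x[j]
    for i in lin:
        acc ^= x[i]
    return acc

def solve_exhaustive(system, n):
    """Return all assignments of n bits that satisfy every equation."""
    return [x for x in product((0, 1), repeat=n)
            if all(eval_quadratic(q, l, c, x) == 0 for q, l, c in system)]

# Hypothetical toy system in 3 variables:
#   x0*x1 + x2     = 0
#   x0 + x1 + 1    = 0
#   x1*x2 + x0 + 1 = 0
system = [
    ([(0, 1)], [2], 0),
    ([], [0, 1], 1),
    ([(1, 2)], [0], 1),
]
solutions = solve_exhaustive(system, 3)
```

The loop visits all 2^n assignments, so the cost doubles with every added variable; that growth is why dedicated hardware and careful enumeration strategies matter at the 48-variable scale mentioned above.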
Jul, 25
High Performance Implementation of Ultrasound Color Doppler Imaging on GPU platform
The ability to detect and assess blood flow information in color Doppler imaging (CDI) plays an important role in modern ultrasound imaging systems. However, it has mainly been implemented on custom-designed hardware due to the large amount of data and computation involved. The recent trend toward programmable approaches offers the advantages of flexibility and quick […]