Posts
Aug, 28
A new multi-core pipelined architecture for executing sequential programs for parallel geospatial computing
Parallel programming on multi-core processors has become the industry’s biggest software challenge. This paper proposes a novel parallel architecture for executing sequential programs using multi-core pipelining based on program slicing by a new memory/cache dynamic management technology. The new architecture is very suitable for processing large geospatial data in parallel without parallel programming. This paper […]
Aug, 28
Automating GPU computing in MATLAB
MATLAB is a popular software platform for scientific and engineering software writers. It offers a high level of abstraction for fundamental mathematical operations and extensive highly optimized domain-specific libraries for several scientific and engineering disciplines. With the recent availability of GPU libraries for MATLAB, it has become possible to easily exploit GPGPUs as coprocessors. However, […]
Aug, 28
CUDA accelerated iris template matching on Graphics Processing Units (GPUs)
In this paper we develop a parallelized iris template matching implementation on inexpensive Graphics Processing Units (GPUs) with Nvidia’s CUDA programming model to achieve matching rates of 44 million iris template comparisons per second without rotation invariance. With tolerance to head tilt, we achieve 4.2 million matches per second and compare our implementation to state […]
Aug, 28
Real-time task reconfiguration support applied to an UAV-based surveillance system
Modern surveillance systems, such as those based on the use of unmanned aerial vehicles, require powerful high-performance platforms to deal with many different algorithms that make use of massive calculations. At the same time, low-cost and high-performance specific hardware (e.g., GPU, PPU) is on the rise and CPUs have turned to multiple cores, together characterizing an interesting […]
Aug, 28
Collaborative execution environment for heterogeneous parallel systems
Nowadays, commodity computers are complex heterogeneous systems that provide a huge amount of computational power. However, to take advantage of this power we have to orchestrate the use of processing units with different characteristics. Such distributed memory systems make use of relatively slow interconnection networks, such as system buses. Therefore, most of the time we […]
Aug, 28
Real-time photo style transfer
This paper presents a novel approach for real-time photo style transfer. The automatic image manipulation technique is performed in the oRGB color space, which is a new color model based on the psychologically opponent color theory. We transfer color from an appropriate source image to the target image using a simple statistical analysis. In addition, […]
Aug, 28
Global Illumination for Advanced Computer Graphics
Real-time 3D graphics is present today on various devices, from high-end PCs powered by highly complex GPUs to simpler handheld consoles or mobile phones. All these solutions are based on an aging technique called immediate mode rasterization, which is very efficient for rendering simple scenes but unable to capture essential visual features such as soft shadows, […]
Aug, 27
Switching to High Gear: Opportunities for Grand-Scale Real-Time Parallel Simulations
The recent emergence of dramatically large computational power, spanning desktops with multi-core processors and multiple graphics cards to supercomputers with 10^5 processor cores, has suddenly resulted in simulation-based solutions trailing behind in the ability to fully tap the new computational capacity. Here, we motivate the need for switching the parallel simulation research to a higher […]
Aug, 27
directCell: hybrid systems with tightly coupled accelerators
The Cell Broadband Engine (Cell/B.E.) processor is a hybrid IBM PowerPC processor. In blade servers and PCI Express card systems, it has been used primarily in a server context, with Linux as the operating system. Because neither Linux as an operating system nor a PowerPC processor-based architecture is the preferred choice for all applications, some […]
Aug, 27
A breadth-first course in multicore and manycore programming
The technique of scaling hardware performance through increasing the number of cores on a chip requires programmers to learn to write parallel code that can exploit this hardware. In order to expose students to a variety of multicore programming models, our university offered a breadth-first introduction to multicore and manycore programming for upper-level undergraduates. Our […]
Aug, 27
A structured parallel periodic Arnoldi shooting algorithm for RF-PSS analysis based on GPU platforms
Recent multi/many-core CPUs and GPUs have provided an ideal parallel computing platform to accelerate the time-consuming analysis of radio-frequency/millimeter-wave (RF/MM) integrated circuits (ICs). This paper develops a structured shooting algorithm that can take full advantage of parallelism in periodic steady state (PSS) analysis. Utilizing the periodic structure of the state matrix of RF/MM-IC simulation, a […]
Aug, 27
Automatic contention detection and amelioration for data-intensive operations
To take full advantage of the parallelism offered by a multi-core machine, one must write parallel code. Writing parallel code is difficult. Even when one writes correct code, there are numerous performance pitfalls. For example, an unrecognized data hotspot could mean that all threads effectively serialize their access to the hotspot, and throughput is dramatically […]