Posts
Nov, 12
Multithread Content Based File Chunking System in CPU-GPGPU Heterogeneous Architecture
The fast development of Graphics Processing Unit (GPU) leads to the popularity of General-purpose usage of GPU (GPGPU). So far, most modern computers are CPU-GPGPU heterogeneous architecture and CPU is used as host processor. In this work, we promote a multithread file chunking prototype system, which is able to exploit the hardware organization of the […]
Nov, 12
Creating HW/SW co-designed MPSoPC’s from high level programming models
FPGA densities have continued to follow Moore’s law and can now support a complete multiprocessor system on programmable chip. The benefits of the FPGA include the ability to build a customized MPSoC system consisting of heterogeneous processing resources, interconnects and memory hierarchies that best match the requirements of each application. In this paper we outline […]
Nov, 12
A translator framework for Dynamic Programming problems
The advent of multicore systems, joined to the potential acceleration of the graphics processing units, has given us a low cost computation capability unprecedented. The new systems alleviate some well known important architectural problems at the expense of a considerable increment of the programmability wall. The heterogeneity, both at architectural and programming level at the […]
Nov, 12
Compiling for a heterogeneous vector image processor
We present a new compilation strategy, implemented at a small cost, to optimize image applications developed on top of a high level image processing library for an heterogeneous processor with a vector image processing accelerator. The library provides the semantics of the image computations. The pipelined structure of the accelerator allows to compute whole expressions […]
Nov, 12
Safe Asynchronous Multicore Memory Operations
Asynchronous memory operations provide a means for coping with the memory wall problem in multicore processors, and are available in many platforms and languages, e.g., the Cell Broadband Engine, CUDA and OpenCL. Reasoning about the correct usage of such operations involves complex analysis of memory accesses to check for races. We present a method and […]
Nov, 11
Performance Analysis and Benchmarking of the Intel SCC
There has been a continuous change over the past years in CPU design and development towards both power-aware hardware architectures as well as many-core processors. The Intel Single-chip Cloud Computer (SCC) combines those two trends. It is an experimental prototype created by Intel Labs consisting of 48 Pentium cores. The SCC is a highly configurable […]
Nov, 11
Building a Real-Time Multi-GPU Platform: Robust Real-Time Interrupt Handling Despite Closed-Source Drivers
Architectures in which multicore chips are augmented with graphics processing units (GPUs) have great potential in many domains in which computationally intensive real-time workloads must be supported. However, unlike standard CPUs, GPUs are treated as I/O devices and require the use of interrupts to facilitate communication with CPUs. Given their disruptive nature, interrupts must be […]
Nov, 11
Many-body quantum chemistry on graphics processing units
Heterogeneous nodes composed of a multicore CPU and at least one graphics processing unit (GPU) are increasingly common in high-performance scientific computing, and significant programming effort is currently being undertaken to port existing scientific algorithms to these unique architectures. We present implementations for two many-body quantum chemistry methods on heterogeneous nodes: the coupled-cluster with single […]
Nov, 11
Accelerating the Smoldyn Spatial Stochastic Biochemical Reaction Network Simulator Using GPUs
Smoldyn is a spatio-temporal biochemical reaction network simulator. It belongs to a class of methods called particle-based methods and is capable of handling effects such as molecular crowding. Individual molecules are modelled as point objects that can diffuse and react in a control volume. Since each molecule has to be simulated individually, the computational complexity […]
Nov, 11
Improving GPU Performance via Large Warps and Two-Level Warp Scheduling
Due to their massive computational power, graphics processing units (GPUs) have become a popular platform for executing general purpose parallel applications. GPU programming models allow the programmer to create thousands of threads, each executing the same computing kernel. GPUs exploit this parallelism in two ways. First, threads are grouped into fixed-size SIMD batches known as […]
Nov, 11
An Interest Point Based Illumination Condition Matching Approach to Photometric Registration Within Augmented Reality Worlds
With recent and continued increases in computing power, and advances in the field of computer graphics, realistic augmented reality environments can now offer inexpensive and powerful solutions in a whole range of training, simulation and leisure applications. One key challenge to maintaining convincing augmentation, and therefore user immersion, is ensuring consistent illumination conditions between virtual […]
Nov, 11
Synthetic Aperture Beamformation using the GPU
A synthetic aperture ultrasound beamformer is implemented for a GPU using the OpenCL framework. The implementation supports beamformation of either RF signals or complex baseband signals. Transmit and receive apodization can be either parametric or dynamic using a fixed F-number, a reference, and a direction. Images can be formed using an arbitrary number of emissions […]