Posts
Sep, 11
Challenges of medical image processing
In today's health care, imaging plays an important role throughout the entire clinical process, from diagnostics and treatment planning to surgical procedures and follow-up studies. Since most imaging modalities have gone directly digital, with continually increasing resolution, medical image processing has to face the challenges arising from large data volumes. In this paper, we […]
Sep, 11
A data parallel view on polyhedral process networks
Emerging architectures in embedded space are expected to make use of a diverse mix of multicores, vector-based units, GPU cores and special function accelerators. In order to facilitate mapping onto diverse architectures, different models of computation have been considered. Polyhedral Process Networks (PPNs) have been extensively used in automatic generation of task and pipeline parallel […]
Sep, 11
High-performance SIMT code generation in an active visual effects library
SIMT (Single-Instruction Multiple-Thread) is an emerging programming paradigm for high-performance computational accelerators, pioneered in current and next generation GPUs and hybrid CPUs. We present a domain-specific active-library supported approach to SIMT code generation and optimisation in the field of visual effects. Our approach uses high-level metadata and runtime context to guide and to ensure the […]
Sep, 11
Software-based branch predication for AMD GPUs
Branch predication is a program transformation technique that combines the instructions of multiple branches of an if statement into a straight-line sequence and associates each instruction of the sequence with a predicate. Branch predication improves the execution of branch statements on processors that support predicated execution of instructions, e.g., Intel IA-64, because such transformation improves […]
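As a rough illustration of the transformation described in this excerpt, the sketch below rewrites a two-way if statement as a straight-line sequence in which each result is guarded by a predicate. It is not the paper's method and does not target AMD GPUs; the function names `branchy` and `predicated` and the arithmetic in each branch are hypothetical, chosen only to show the idea.

```python
def branchy(x, a, b):
    """Original form: control flow picks one of two branches."""
    if x > 0:
        y = a + b
    else:
        y = a - b
    return y


def predicated(x, a, b):
    """Predicated form (illustrative assumption, not the paper's code).

    Both branch bodies run unconditionally; the predicate p decides
    which result is kept, so no branch on x remains in the sequence.
    """
    p = int(x > 0)            # predicate: 1 if the 'then' side is taken, else 0
    then_val = a + b          # instruction from the 'then' branch, guarded by p
    else_val = a - b          # instruction from the 'else' branch, guarded by not p
    return p * then_val + (1 - p) * else_val
```

Both functions return the same value for any integer inputs. The predicated version trades extra arithmetic for the removal of the branch, which is the trade-off that makes the technique attractive on hardware where divergent branches are costly.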
Sep, 11
Solving diffractive optics problems using graphics processing units
Techniques for applying graphics processing units (GPUs) to general-purpose nongraphics computations, proposed in recent years by ATI (AMD FireStream, 2006) and NVIDIA (CUDA: Compute Unified Device Architecture, 2007), have given an impetus to developing algorithms and software packages for solving problems of diffractive optics with the aid of the GPU. The computations […]
Sep, 9
Enabling multiple accelerator acceleration for Java/OpenMP
While using a single GPU is fairly easy, using multiple CPUs and GPUs, potentially distributed over multiple machines, is hard because data needs to be kept consistent using message exchange and the load needs to be balanced. We propose (1) an array package that provides partitioned and replicated arrays and (2) a compute-device library to […]
Sep, 9
Heterogeneous multicore parallel programming for graphics processing units
Hybrid parallel multicore architectures based on graphics processing units (GPUs) can provide tremendous computing power. Current NVIDIA and AMD Graphics Product Group hardware displays a peak performance of hundreds of gigaflops. However, exploiting GPUs from existing applications is a difficult task that requires non-portable rewriting of the code. In this paper, we present HMPP, a […]
Sep, 9
Beyond programmable shading (parts I and II)
There are strong indications that the future of interactive graphics programming is a more flexible model than today’s OpenGL/Direct3D pipelines. Graphics developers need a basic understanding of how to combine emerging parallel programming techniques and more flexible graphics processors with the traditional interactive rendering pipeline. As the first in a series, this course introduces the […]
Sep, 9
Data classification for artificial intelligence construct training to aid in network incident identification using network telescope data
This paper considers the complexities involved in obtaining training data for use by artificial intelligence constructs to identify potential network incidents using passive network telescope data. While a large amount of data obtained from network telescopes exists, this data is not currently marked for known incidents. Problems related to this marking process include the accuracy […]
Sep, 9
A stream-computing extension to OpenMP
This paper introduces an extension to OpenMP 3.0 enabling stream programming with minimal, incremental additions that seamlessly integrate into the current specification. The stream programming model decomposes programs into tasks and makes the flow of data among them explicit, thus exposing data, task and pipeline parallelism. It helps programmers express concurrency and data locality properties, […]
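To make the stream programming model in this excerpt more concrete, here is a minimal sketch of tasks connected by explicit data streams. It uses Python threads and queues rather than OpenMP, and it is not the extension the paper proposes; the names `producer`, `square_stage`, `run_pipeline` and `SENTINEL` are hypothetical.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def producer(out_q, n):
    """First task: emits the integers 0..n-1 onto its output stream."""
    for i in range(n):
        out_q.put(i)
    out_q.put(SENTINEL)


def square_stage(in_q, out_q):
    """Second task: reads from one stream, writes squares to the next."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            out_q.put(SENTINEL)  # forward end-of-stream downstream
            break
        out_q.put(item * item)


def run_pipeline(n):
    """Wire the tasks together and collect the final stream in order."""
    q1, q2 = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(q1, n)),
        threading.Thread(target=square_stage, args=(q1, q2)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = q2.get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Each stage runs as its own task and communicates only through its queues, so the data flow between tasks is explicit and the stages can overlap in time as a pipeline. With a single producer and a single consumer per FIFO queue, the output order matches the input order.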
Sep, 9
CUDACS: securing the cloud with CUDA-enabled secure virtualization
While on the one hand unresolved security issues pose a barrier to the widespread adoption of cloud computing technologies, on the other hand the computing capabilities of even commodity hardware are growing, in particular thanks to the adoption of *-core technologies. For instance, the Nvidia Compute Unified Device Architecture (CUDA) technology is increasingly available on […]
Sep, 9
KAdvice: inferring synchronization patterns from an existing codebase
Operating system kernels are complex software systems. The kernels of today's mainstream OSs, such as Linux or Windows, are composed of a number of modules, which contain code and data. Even when providing synchronous interfaces (APIs) to the programmer, large portions of the OS kernel operate in an asynchronous manner. Synchronizing access to kernel data […]