
Posts

Sep, 11

A design pattern language for engineering (parallel) software: merging the PLPP and OPL projects

Parallel programming is stuck. To make progress, we need to step back and understand the software people wish to engineer. We do this with a design pattern language. This paper provides background for a lively discussion of this pattern language. We present the context for the problem, the layers in the design pattern language, and […]
Sep, 11

Automatic safety proofs for asynchronous memory operations

We present a work-in-progress proof system and tool, based on separation logic, for analysing memory safety of multicore programs that use asynchronous memory operations.
Sep, 11

Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs

Software engineering demands generality and abstraction, performance demands specialization and concretization. Generative programming can provide both, but the effort required to develop high-quality program generators likely offsets their benefits, even if a multi-stage programming language is used. We present lightweight modular staging, a library-based multi-stage programming approach that breaks with the tradition of syntactic quasi-quotation […]
Sep, 11

Early experiences with the Intel Many Integrated Cores accelerated computing technology

We report on early programming experiences with the Intel Many Integrated Core (Intel MIC) co-processor. This new x86-based technology is Intel’s answer to GPU-based accelerators by NVIDIA, AMD and others. Accelerators have generally sparked interest in the HPC community because they have the potential to significantly increase the compute power of the next […]
Sep, 11

Importance-driven compositing window management

In this paper we present importance-driven compositing window management, which considers windows not only as basic rectangular shapes but also integrates the importance of the windows’ content using a bottom-up visual attention model. Based on this information, importance-driven compositing optimizes the spatial window layout for maximum visibility and interactivity of occluded content in combination with […]
Sep, 11

Task Superscalar: An Out-of-Order Task Pipeline

We present Task Superscalar, an abstraction of the instruction-level out-of-order pipeline that operates at the task level. Like ILP pipelines, which uncover parallelism in a sequential instruction stream, task superscalar uncovers task-level parallelism among tasks generated by a sequential thread. Utilizing intuitive programmer annotations of task inputs and outputs, the task superscalar pipeline dynamically […]
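To make the idea of annotated task inputs and outputs concrete, here is a minimal Python sketch of how dependencies between tasks in a sequential stream could be discovered from such annotations. It is an illustration under stated assumptions, not the pipeline described in the paper: the `Task` class, the `dependencies` and `waves` functions, and the conservative treatment of read-after-write, write-after-read and write-after-write conflicts are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task annotated with the data it reads and writes (hypothetical type)."""
    name: str
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)


def dependencies(tasks):
    """Map each task index to the indices of earlier tasks it must wait for."""
    deps = {}
    for i, later in enumerate(tasks):
        deps[i] = set()
        for j in range(i):
            earlier = tasks[j]
            raw = earlier.outputs & later.inputs    # read-after-write
            war = earlier.inputs & later.outputs    # write-after-read
            waw = earlier.outputs & later.outputs   # write-after-write
            if raw or war or waw:
                deps[i].add(j)
    return deps


def waves(tasks):
    """Group task names into waves; tasks in one wave have no mutual conflicts."""
    deps = dependencies(tasks)
    level = {}
    for i in range(len(tasks)):
        # A task runs one wave after the latest task it depends on.
        level[i] = 1 + max((level[j] for j in deps[i]), default=-1)
    result = [[] for _ in range(max(level.values(), default=-1) + 1)]
    for i, lvl in level.items():
        result[lvl].append(tasks[i].name)
    return result


if __name__ == "__main__":
    stream = [
        Task("A", outputs={"a"}),
        Task("B", outputs={"b"}),
        Task("C", inputs={"a", "b"}, outputs={"c"}),
        Task("D", inputs={"a"}, outputs={"d"}),
    ]
    print(waves(stream))
```

In this sketch, tasks `A` and `B` touch disjoint data and land in the same wave, while `C` and `D` both wait for the first wave. The sketch treats write-after-read and write-after-write conflicts as true dependencies, which is a deliberately conservative simplification.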
Sep, 11

Parallel programming with NVIDIA CUDA

Using hardware acceleration via general-purpose programming on stock GPUs (GPGPU), I’ve sped up my algorithms by more than tenfold. This article shows how you can achieve these results too! Programmers have been interested in leveraging the highly parallel processing power of video cards to speed up applications that are not graphical in nature for a […]
Sep, 11

TimeGraph: GPU scheduling for real-time multi-tasking environments

The Graphics Processing Unit (GPU) is now commonly used for graphics and data-parallel computing. As more and more applications tend to accelerate on the GPU in multi-tasking environments where multiple tasks access the GPU concurrently, operating systems must provide prioritization and isolation capabilities in GPU resource management, particularly in real-time setups. We present TimeGraph, a […]
Sep, 11

Challenges of medical image processing

In today’s health care, imaging plays an important role throughout the entire clinical process, from diagnostics and treatment planning to surgical procedures and follow-up studies. Since most imaging modalities have gone directly digital, with continually increasing resolution, medical image processing has to face the challenges arising from large data volumes. In this paper, we […]
Sep, 11

A data parallel view on polyhedral process networks

Emerging architectures in embedded space are expected to make use of a diverse mix of multicores, vector-based units, GPU cores and special function accelerators. In order to facilitate mapping onto diverse architectures, different models of computation have been considered. Polyhedral Process Networks (PPNs) have been extensively used in automatic generation of task and pipeline parallel […]
Sep, 11

High-performance SIMT code generation in an active visual effects library

SIMT (Single-Instruction Multiple-Thread) is an emerging programming paradigm for high-performance computational accelerators, pioneered in current and next generation GPUs and hybrid CPUs. We present a domain-specific active-library supported approach to SIMT code generation and optimisation in the field of visual effects. Our approach uses high-level metadata and runtime context to guide and to ensure the […]
Sep, 11

Software-based branch predication for AMD GPUs

Branch predication is a program transformation technique that combines the instructions of multiple branches of an if statement into a straight-line sequence and associates each instruction of the sequence with a predicate. Branch predication improves the execution of branch statements on processors that support predicated execution of instructions, e.g., Intel IA-64, because such transformation improves […]
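The transformation described in this abstract can be sketched in a few lines of Python. This is only an illustration of the general idea, not the software technique the paper proposes for AMD GPUs: the functions `branchy`, `predicated` and `select` are hypothetical names, and the arithmetic selection stands in for what predicated hardware instructions would do.

```python
def branchy(x: int) -> int:
    """Original control flow: only one arm of the if statement executes."""
    if x < 0:
        y = -x
        y = y + 1
    else:
        y = x * 2
    return y


def select(pred: int, new: int, old: int) -> int:
    """Commit `new` when the predicate is 1, otherwise keep `old` (no branch)."""
    return pred * new + (1 - pred) * old


def predicated(x: int) -> int:
    """If-converted form: a straight-line sequence, each step guarded by a predicate."""
    p = int(x < 0)   # predicate for the 'then' arm
    q = 1 - p        # predicate for the 'else' arm
    y = 0
    y = select(p, -x, y)       # (p) y = -x
    y = select(p, y + 1, y)    # (p) y = y + 1
    y = select(q, x * 2, y)    # (q) y = x * 2
    return y


if __name__ == "__main__":
    for value in (-3, 0, 5):
        print(value, branchy(value), predicated(value))
```

Both functions return the same result for every integer input. The predicated version always evaluates all three steps, which is the trade-off of the transformation: no control-flow divergence, at the cost of executing instructions from both arms.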

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
