Posts
Sep, 12
Intermediate fabrics: virtual architectures for circuit portability and fast placement and routing
Although hardware/software partitioning of embedded applications onto FPGAs is widely known to have performance and power advantages, FPGA usage has been typically limited to hardware experts, due largely to several problems: 1) difficulty of integrating hardware design tools into well-established software tool flows, 2) increasingly lengthy FPGA design iterations due to placement and routing, and […]
Sep, 12
Task superscalar: using processors as functional units
The complexity of parallel programming greatly limits the effectiveness of chip-multiprocessors (CMPs). This paper presents the case for task superscalar pipelines, an abstraction of traditional out-of-order superscalar pipelines that orchestrates an entire chip-multiprocessor in the same way that out-of-order pipelines manage functional units. Task superscalar leverages an emerging class of task-based dataflow programming models to relieve […]
Sep, 12
Meta-simulation of large WSN on multi-core computers
With advances in wireless communications, large-scale Wireless Sensor Networks (WSN) are emerging with many applications. These networks are deployed to serve a single-objective application with high optimization requirements, such as performance enhancement and power saving. Application-specific optimization is achieved using formal models and evaluation-based simulations of the distributed algorithms (DA) controlling such […]
Sep, 12
A scripting language for Digital Content Creation applications
Digital Content Creation (DCC) applications (e.g. Blender, Autodesk 3ds Max) have long been used for the creation and editing of digital content. With recent advances in the field, the need for controlled, automated work has pushed these applications to add support for small programming languages that give artists power without requiring them to dive into many details. […]
Sep, 12
Symbolic crosschecking of floating-point and SIMD code
We present an effective technique for crosschecking an IEEE 754 floating-point program and its SIMD-vectorized version, implemented in KLEE-FP, an extension to the KLEE symbolic execution tool that supports symbolic reasoning on the equivalence between floating-point values. The key insight behind our approach is that floating-point values are only reliably equal if they are essentially […]
Sep, 12
Implementing cartesian genetic programming classifiers on graphics processing units using GPU.NET
This paper investigates the use of a new Graphics Processing Unit (GPU) programming tool called ‘GPU.NET’ for implementing a Genetic Programming fitness evaluator. We find that the tool is able to help write software that accelerates fitness evaluation. For the first time, Cartesian Genetic Programming (CGP) was used with a GPU-based interpreter. With its code […]
Sep, 12
Optimizing a shared virtual memory system for a heterogeneous CPU-accelerator platform
The client computing platform is moving towards a heterogeneous architecture that combines scalar-oriented CPU cores and throughput-oriented accelerator cores. Recognizing that existing programming models for such heterogeneous platforms are still difficult for most programmers, we advocate a shared virtual memory programming model to improve programmability. In this paper, we focus on performance, and demonstrate that […]
Sep, 12
Simulating vehicle kinematics with SimVis3D and Newton
This paper discusses the simulation of vehicle kinematics with SimVis3D and the Newton Game Dynamics Engine. A Pioneer-like robot is used as a running example. First, its differential drive is simulated manually, without physical effects. Then the dynamics engine is applied to simulate this drive mechanism based on rolling friction between wheels and ground. Comparing […]
Sep, 12
High-performance Computing in China: Research and Applications
In this report we review the history of high-performance computing (HPC) system development and applications in China and describe the current status of major government programs, HPC centers and facilities, major research institutions, important HPC application fields, and domestic vendor activities. A comparison between China and developed nations, such as the United States, European countries, […]
Sep, 11
Unified Tables for Exponential and Logarithm Families
Accurate table methods allow for very accurate and efficient evaluation of elementary functions. We present new single-table approaches to logarithm and exponential evaluation, by which we mean that a single table of values works for both log(x) and log(1+x), and a single table for e^x and e^x - 1. This approach eliminates special cases normally required to […]
Sep, 11
Translation-invariant two-dimensional discrete wavelet transform on graphics processing units
The Discrete Wavelet Transform (DWT) is used in several signal and image processing applications. Because the DWT is computationally expensive, various approaches to speeding it up have been proposed. One approach is to use graphics processing units (GPUs) as stream processors to speed up the calculation of the DWT. This paper presents a GPU implementation of the translation-invariant wavelet transform computed […]
Sep, 11
SSLPV: subsurface light propagation volumes
This paper presents the Subsurface Light Propagation Volume (SSLPV) method for real-time approximation of subsurface scattering effects in dynamic scenes with changing mesh topology and lighting. SSLPV extends the Light Propagation Volume (LPV) technique for indirect illumination in video games. We introduce a new consistent method for injecting flux from point light sources into an […]