Posts

Sep, 13

Software Challenges for Extreme Scale Computing: Going From Petascale to Exascale Systems

Preparing applications for a transition from petascale to exascale systems will require a very large investment in several areas of software research and development. The introduction of manycore nodes, the abundance of parallelism, an increase in system faults (including soft errors) and a complicated, multi-component software environment are some of the most challenging issues we […]
Sep, 12

Energy-efficient computing for extreme-scale science

A many-core processor design for high-performance systems draws from embedded computing’s low-power architectures and design processes, providing a radical alternative to cluster solutions. The computational power required to accurately model extreme problem spaces, such as climate change, requires more than a business-as-usual approach. Building ever-larger clusters of commercial off-the-shelf (COTS) hardware will be increasingly constrained […]
Sep, 12

Intermediate fabrics: virtual architectures for circuit portability and fast placement and routing

Although hardware/software partitioning of embedded applications onto FPGAs is widely known to have performance and power advantages, FPGA usage has been typically limited to hardware experts, due largely to several problems: 1) difficulty of integrating hardware design tools into well-established software tool flows, 2) increasingly lengthy FPGA design iterations due to placement and routing, and […]
Sep, 12

Task superscalar: using processors as functional units

The complexity of parallel programming greatly limits the effectiveness of chip-multiprocessors (CMPs). This paper presents the case for task superscalar pipelines, an abstraction of traditional out-of-order superscalar pipelines, that orchestrates an entire chip-multiprocessor to the same degree that out-of-order pipelines manage functional units. Task superscalar leverages an emerging class of task-based dataflow programming models to relieve […]
Sep, 12

Meta-simulation of large WSN on multi-core computers

With advances in wireless communications, large-scale Wireless Sensor Networks (WSNs) are emerging with many applications. These networks are deployed to serve a single-objective application with demanding optimization requirements, such as performance enhancement and power saving. Application-specific optimization is achieved using formal models and evaluation-based simulations of distributed algorithms (DA) controlling such […]
Sep, 12

A scripting language for Digital Content Creation applications

Digital Content Creation (DCC) applications (e.g. Blender, Autodesk 3ds Max) have long been used for the creation and editing of digital content. Owing to advances in the field, the need for controlled, automated work has pushed these applications to add support for small programming languages that give artists power without requiring them to dive into many details. […]
Sep, 12

Symbolic crosschecking of floating-point and SIMD code

We present an effective technique for crosschecking an IEEE 754 floating-point program and its SIMD-vectorized version, implemented in KLEE-FP, an extension to the KLEE symbolic execution tool that supports symbolic reasoning on the equivalence between floating-point values. The key insight behind our approach is that floating-point values are only reliably equal if they are essentially […]
Sep, 12

Implementing cartesian genetic programming classifiers on graphics processing units using GPU.NET

This paper investigates the use of a new Graphics Processing Unit (GPU) programming tool called ‘GPU.NET’ for implementing a Genetic Programming fitness evaluator. We find that the tool is able to help write software that accelerates fitness evaluation. For the first time, Cartesian Genetic Programming (CGP) was used with a GPU-based interpreter. With its code […]
Sep, 12

Optimizing a shared virtual memory system for a heterogeneous CPU-accelerator platform

The client computing platform is moving towards a heterogeneous architecture that combines scalar-oriented CPU cores and throughput-oriented accelerator cores. Recognizing that existing programming models for such heterogeneous platforms are still difficult for most programmers, we advocate a shared virtual memory programming model to improve programmability. In this paper, we focus on performance, and demonstrate that […]
Sep, 12

Simulating vehicle kinematics with SimVis3D and Newton

This paper discusses the simulation of vehicle kinematics with SimVis3D and the Newton Game Dynamics Engine. As a running example, a Pioneer-like robot is used. First, its differential drive is simulated manually, without physical effects. Then the dynamics engine is applied to simulate this drive mechanism based on rolling friction between wheels and ground. Comparing […]
Sep, 12

High-performance Computing in China: Research and Applications

In this report we review the history of high-performance computing (HPC) system development and applications in China and describe the current status of major government programs, HPC centers and facilities, major research institutions, important HPC application fields, and domestic vendor activities. A comparison between China and developed nations, such as the United States, European countries, […]
Sep, 11

Unified Tables for Exponential and Logarithm Families

Accurate table methods allow for very accurate and efficient evaluation of elementary functions. We present new single-table approaches to logarithm and exponential evaluation, by which we mean that a single table of values works for both log(x) and log(1+x), and a single table for e^x and e^x - 1. This approach eliminates special cases normally required to […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org