7041

Posts

Jan, 19

Platform Characterization for Domain-Specific Computing

We believe that by adapting architectures to fit the requirements of a given application domain, we can significantly improve the efficiency of computation. To validate the idea for our application domain, we evaluate a wide spectrum of commodity computing platforms to quantify the potential benefits of heterogeneity and customization for the domain-specific applications. In particular, […]
Jan, 19

Human Re-identification System On Highly Parallel GPU and CPU Architectures

The paper presents a new approach to the human re-identification problem using covariance features. In many cases, a distance operator between signatures, based on generalized eigenvalues, has to be computed efficiently, especially when a real-time response is expected from the system. This is a challenging problem, as many procedures are computationally intensive and must […]
Jan, 19

Power Profiling and Optimization for Heterogeneous Multi-Core Systems

Processing speed and energy efficiency are two of the most critical issues for computer systems. This paper presents a systematic approach for profiling the power and performance characteristics of applications targeting heterogeneous multi-core computing platforms. Our approach enables rapid and automated design space exploration involving optimisation of workload distribution for systems with accelerators such as […]
Jan, 19

Asymptotic Peak Utilisation in Heterogeneous Parallel CPU/GPU Pipelines: A Decentralised Queue Monitoring Strategy

Heterogeneous parallel computing has become an unavoidable consequence of the emergence of General-Purpose computing on graphics processing units (GPGPU). The characteristics of a Graphics Processing Unit (GPU), including significant memory transfer latency and complex performance characteristics, demand new approaches to ensuring that all available computational resources are geared towards optimal utilisation. This paper considers the simple case […]
Jan, 19

A Compile-Time Managed Multi-Level Register File Hierarchy

As processors increasingly become power limited, performance improvements will be achieved by rearchitecting systems with energy efficiency as the primary design constraint. While some of these optimizations will be hardware based, combined hardware and software techniques likely will be the most productive. This work redesigns the register file system of a modern throughput processor with […]
Jan, 19

Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud

Data analytics are key applications running in the cloud computing environment. To improve performance and cost-effectiveness of a data analytics cluster in the cloud, the data analytics system should account for heterogeneity of the environment and workloads. In addition, it also needs to provide fairness among jobs when multiple jobs share the cluster. In this […]
Jan, 18

Parallel Optimization of Queries in XML Dataset Using GPU

As XML is playing a crucial role in web services, databases, and document processing, efficient processing of XML queries has become an important issue. On the other hand, due to the increasing number of users, high throughput of XML queries is also required to execute tens of thousands of queries in a short time. Given […]
Jan, 18

Solving the Boltzmann equation on GPUs

We show how to accelerate the direct solution of the Boltzmann equation using Graphics Processing Units (GPUs). In order to fully exploit the computational power of the GPU, we choose a method of solution which combines a finite difference discretization of the free-streaming term with a Monte Carlo evaluation of the collision integral. The efficiency […]
Jan, 18

GPU Accelerated Registration of a Statistical Shape Model of the Lumbar Spine to 3D Ultrasound Images

We present a parallel implementation of a statistical shape model registration to 3D ultrasound images of the lumbar vertebrae (L2-L4). The Covariance Matrix Adaptation Evolution Strategy optimization technique, along with the Linear Correlation of Linear Combination similarity metric, has been used to improve the robustness and capture range of the registration approach. Instantiation and ultrasound simulation have […]
Jan, 18

Cognitive radio network for the smart grid: Experimental system architecture, control algorithms, security, and microgrid testbed

This paper systematically investigates the novel idea of applying a next-generation wireless technology, the cognitive radio network, to the smart grid. In particular, system architecture, algorithms, and hardware testbed are studied. A microgrid testbed supporting both power flow and information flow is also proposed. Control strategies and security considerations are discussed. Furthermore, the concept of […]
Jan, 18

The ‘Chimera’: an off-the-shelf CPU/GPGPU/FPGA hybrid computing platform

The nature of modern astronomy means that a number of interesting problems exhibit a substantial computational bound and this situation is gradually worsening. Scientists, increasingly fighting for valuable resources on conventional high performance computing (HPC) facilities, often with a limited customizable user environment, are increasingly looking to hardware acceleration solutions. We describe here a heterogeneous CPU/GPGPU/FPGA desktop […]
Jan, 18

Investigation of Parallel Computation – MPI, CUDA and Parallel Visualization

In this manuscript, parallel computation is investigated, including a review of different programming APIs and architectures. Two specific parallel APIs, MPI and CUDA C, are analyzed in depth. Two sorting algorithms and a visual mathematics problem are implemented with MPI, along with performance analysis. A stable fluid dynamics simulation has been explored with CUDA. We also present a […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
