Posts
Jan, 19
Platform Characterization for Domain-Specific Computing
We believe that by adapting architectures to fit the requirements of a given application domain, we can significantly improve the efficiency of computation. To validate the idea for our application domain, we evaluate a wide spectrum of commodity computing platforms to quantify the potential benefits of heterogeneity and customization for the domain-specific applications. In particular, […]
Jan, 19
Human Re-identification System On Highly Parallel GPU and CPU Architectures
The paper presents a new approach to the human re-identification problem using covariance features. In many cases a distance operator between signatures based on generalized eigenvalues has to be computed efficiently, especially when a real-time response is expected from the system. This is a challenging problem as many procedures are computationally intensive tasks and must […]
Jan, 19
Power Profiling and Optimization for Heterogeneous Multi-Core Systems
Processing speed and energy efficiency are two of the most critical issues for computer systems. This paper presents a systematic approach for profiling the power and performance characteristics of applications targeting heterogeneous multi-core computing platforms. Our approach enables rapid and automated design space exploration involving optimisation of workload distribution for systems with accelerators such as […]
Jan, 19
Asymptotic Peak Utilisation in Heterogeneous Parallel CPU/GPU Pipelines: A Decentralised Queue Monitoring Strategy
Heterogeneous parallel computing has become an unavoidable consequence of the emergence of General-Purpose computing on graphics processing units (GPGPU). The characteristics of a Graphics Processing Unit (GPU), including significant memory transfer latency and complex performance characteristics, demand new approaches to ensuring that all available computational resources are geared towards optimal utilisation. This paper considers the simple case […]
Jan, 19
A Compile-Time Managed Multi-Level Register File Hierarchy
As processors increasingly become power limited, performance improvements will be achieved by rearchitecting systems with energy efficiency as the primary design constraint. While some of these optimizations will be hardware based, combined hardware and software techniques likely will be the most productive. This work redesigns the register file system of a modern throughput processor with […]
Jan, 19
Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud
Data analytics are key applications running in the cloud computing environment. To improve performance and cost-effectiveness of a data analytics cluster in the cloud, the data analytics system should account for heterogeneity of the environment and workloads. In addition, it also needs to provide fairness among jobs when multiple jobs share the cluster. In this […]
Jan, 18
Parallel Optimization of Queries in XML Dataset Using GPU
As XML is playing a crucial role in web services, databases, and document processing, efficient processing of XML queries has become an important issue. On the other hand, due to the increasing number of users, high throughput of XML queries is also required to execute tens of thousands of queries in a short time. Given […]
Jan, 18
Solving the Boltzmann equation on GPUs
We show how to accelerate the direct solution of the Boltzmann equation using Graphics Processing Units (GPUs). In order to fully exploit the computational power of the GPU, we choose a method of solution which combines a finite difference discretization of the free-streaming term with a Monte Carlo evaluation of the collision integral. The efficiency […]
Jan, 18
GPU Accelerated Registration of a Statistical Shape Model of the Lumbar Spine to 3D Ultrasound Images
We present a parallel implementation of a statistical shape model registration to 3D ultrasound images of the lumbar vertebrae (L2-L4). The Covariance Matrix Adaptation Evolution Strategy optimization technique, along with the Linear Correlation of Linear Combination similarity metric, has been used to improve the robustness and capture range of the registration approach. Instantiation and ultrasound simulation have […]
Jan, 18
Cognitive radio network for the smart grid: Experimental system architecture, control algorithms, security, and microgrid testbed
This paper systematically investigates the novel idea of applying a next-generation wireless technology, the cognitive radio network, to the smart grid. In particular, system architecture, algorithms, and hardware testbed are studied. A microgrid testbed supporting both power flow and information flow is also proposed. Control strategies and security considerations are discussed. Furthermore, the concept of […]
Jan, 18
The ‘Chimera’: an off-the-shelf CPU/GPGPU/FPGA hybrid computing platform
The nature of modern astronomy means that a number of interesting problems exhibit a substantial computational bound and this situation is gradually worsening. Scientists, increasingly fighting for valuable resources on conventional high performance computing (HPC) facilities (often with a limited customizable user environment), are increasingly looking to hardware acceleration solutions. We describe here a heterogeneous CPU/GPGPU/FPGA desktop […]
Jan, 18
Investigation of Parallel Computation – MPI, CUDA and Parallel Visualization
In this manuscript, parallel computation is investigated, including a review of different programming APIs and architectures. Two specific parallel APIs, MPI and CUDA C, are analyzed in depth. Two sorting algorithms and a visual mathematical problem are implemented with MPI, along with performance analysis. A stable fluid dynamics simulation has been experimented with using CUDA. We also present a […]