
Posts

Aug, 26

Considerations when evaluating microprocessor platforms

Motivated by recent papers comparing CPU and GPU performance, this paper explores the questions: Why do we compare microprocessors and by what means should we compare them? We distinguish two perspectives from which to make comparisons: those of application developers and of computer architecture researchers. We survey the distinct concerns of these groups, identifying essential information each […]
Aug, 26

Exploring graphics processing units as parallel coprocessors for online aggregation

Multidimensional aggregation is one of the most important computational building blocks and hence also a potential performance bottleneck in Online Analytic Processing (OLAP). In order to deliver fast query responses for interactive operations such as slicing, dicing, roll-up and drill-down, it is essential that aggregates along the relevant dimensions of a data cube can be […]
Aug, 26

Parallel Viewshed Analysis on GPU Using CUDA

Viewshed analysis is a long-established function of many geographical information systems that determines the cells of an input raster visible from one or more observers. Extending it to larger scales or higher resolutions requires a parallel implementation to keep run times tolerable. In this paper, we describe a GPU parallelization of viewshed analysis using […]
Aug, 26

GPU Based Real-time Correction for Optical Distortions in Head-Mounted Displays

This paper presents a GPU-based real-time method to correct optical distortions in head-mounted displays (HMDs). The HMD to be corrected is a lightweight, wide field-of-view HMD system with a free-form-surface (FFS) prism, in which the image distortion is not rectilinear and centrosymmetric. A special predistortion model is constructed to correct the distortion of the HMD. […]
Aug, 26

Acceleration of an improved Retinex algorithm

Retinex is an image restoration method, and the center/surround Retinex is appropriate for parallelization because it uses a convolution operation with a large kernel size to achieve dynamic range compression and color/lightness rendition. However, its great capability for image enhancement comes at the cost of intensive computation. This paper presents GPURetinex, which is a data parallel algorithm based […]
Aug, 26

Accelerating tetrahedral interpolation with data-level and Thread-Level Parallel optimization

The tetrahedral interpolation method for color space conversion consumes the longest time in the entire color management process. This makes it difficult to implement a purely software-based high-end image processing system. In this study, SIMD (Single Instruction Multiple Data) and GPGPU (General Purpose Graphics Processing Unit) based optimizations for tetrahedral interpolation are implemented. To exploit […]
Aug, 26

Multi-level parallelism, global arrays, GPGPU Programming: Unify programming paradigms on Grid computing with efficiency

As technology advances, computing resources improve in many respects: larger capacity, greater capability, and higher speed. However, with heterogeneously distributed resources in a Grid computing environment, developing an application that fully utilizes the resources is a challenge. In particular, the computing resources themselves regularly upgrade their computing power, for example by recruiting General […]
Aug, 25

TH-1: China’s first petaflop supercomputer

In recent years, heterogeneous systems and cooperative computing have become popular research directions in the field of high performance computing. With the rapid growth in the size of high performance computer systems, problems such as power consumption and reliability come to the forefront. The aim of high performance computing has thus shifted from merely seeking peak […]
Aug, 25

Hera-JVM: a runtime system for heterogeneous multi-core architectures

Heterogeneous multi-core processors, such as the IBM Cell processor, can deliver high performance. However, these processors are notoriously difficult to program: different cores support different instruction set architectures, and the processor as a whole does not provide coherence between the different cores’ local memories. We present Hera-JVM, an implementation of the Java Virtual Machine which […]
Aug, 25

Parallelizing compiler framework and API for power reduction and software productivity of real-time heterogeneous multicores

Heterogeneous multicores have been attracting much attention in a wide range of areas as a way to attain high performance while keeping power consumption low. However, heterogeneous multicores make programming very difficult. The long application development period lowers product competitiveness. To overcome this situation, this paper proposes a compilation framework which bridges a gap between […]
Aug, 25

MobiRT: an implementation of OpenGL ES-based CPU-GPU hybrid ray tracer for mobile devices

Three-dimensional user interfaces on mobile devices are increasingly important. For more realistic three-dimensional visualization on mobile devices, we present the implementation of an OpenGL ES-based CPU-GPU hybrid ray tracer. This ray tracer exploits the availability of CPU and GPU architectures to fully support reflection, refraction, hard shadows, and dynamic scenes. To the best of our […]
Aug, 25

Considering GPGPU for HPC Centers: Is It Worth the Effort?

In contrast to just a few years ago, the answer to the question "What system should we buy next to best assist our users" has become a lot more complicated for the operators of an HPC center today. In addition to multicore architectures, powerful accelerator systems have emerged, and the future looks heterogeneous. In this […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
