Kevin Pouget
In this thesis, we propose to study interactive debugging of applications running on embedded Multi-Processor System-on-Chip (MPSoC) systems. A literature study showed that the design and development of these applications increasingly rely on programming models and development frameworks. These environments gather established algorithmic and programming good practices, and hence speed up […]
Ngai-Man Cheung, Xiaopeng Fan, Oscar C. Au, Man-Cheung Kung
In this article, we investigate using multi-core graphics processing units (GPUs) for video encoding and decoding. After an overview of video coding and GPUs, we review some previous work on structuring video coding modules so that the massively parallel processing capability of GPUs can be harnessed. We also review previous work on partitioning the video […]
Ju Ren, Mei Wen, Chunyuan Zhang, Huayou Su, Yi He, Nan Wu
Motion estimation is an important compute-intensive component in most video compression standards. The high computational cost and heavy memory bandwidth requirements of motion estimation put heavy pressure on most existing programmable processors, especially in real-time high-definition H.264 video encoding. The emerging stream processing model, supported by most programmable processors, provides a powerful mechanism to […]
Chi-Wing Fu, Liang Wan, Tien-Tsin Wong, Chi-Sing Leung
Omnidirectional videos are usually mapped to a planar domain for encoding with off-the-shelf video compression standards. However, existing work typically neglects the effect of the sphere-to-plane mapping. In this paper, we show that by carefully designing the mapping, we can improve the visual quality, stability and compression efficiency of encoding omnidirectional videos. Here we propose a […]
Ngai-Man Cheung, Oscar C. Au, Man-Cheung Kung, Xiaopeng Fan
Rate-distortion (RD) optimized intra-prediction mode selection can lead to a significant improvement in coding efficiency in intra-frame encoding. However, it incurs a considerable increase in encoding complexity. In this paper, we investigate how multi-core Graphics Processing Units (GPUs) can be efficiently utilized to undertake the task of RD optimized intra mode selection in AVS and H.264 […]
Jin Zhao, Xinya Zhang, Xin Wang
The graphics processing unit (GPU) has evolved into a general-purpose computing platform. Inspired by the advantages of GPU technology, this paper concerns the design and performance evaluation of a practical GPU-accelerated server architecture for Video-on-Demand (VoD) services with network coding. Following the proposal of an optimized network coding algorithm based on parallel threads on GPU, a GPU-Accelerated Server […]
Liangbao Jiao, Jing Zhou, Rui Chen
An intra-prediction mode with 4×4 and 16×16 block sizes for the luma component and an 8×8 block size for the chroma component is used in H.264 to improve the rate-distortion performance. However, the computational complexity of the H.264 encoder is drastically increased due to the various intra-prediction modes. Recently, efficient hardware architectures were proposed for the fast execution […]
Chuan-Yiu Lee, Yu-Cheng Lin, Chi-Ling Wu, Chin-Hsiang Chang, You-Ming Tsao, Shao-Yi Chien
In this paper, multi-pass and frame-parallel algorithms are proposed to accelerate various motion estimation (ME) tools in H.264 with the graphics processing unit (GPU). By using the multi-pass method to unroll and rearrange the multiple nested loops, integer-pel ME can be implemented as a two-pass process on the GPU. Moreover, fractional ME needs six passes for […]
Pablo Montero, Javier Taibo, Victor Gulias, Samuel Rivas
GPUs excel at parallel computations, so they are very efficient at calculating the discrete cosine transform of spatial domain images, as required for video encoding. The last steps of MPEG-2 compression, however, are inherently sequential since they require serial processing of the resulting DCT coefficients. As that can easily become a bottleneck in GPU-based video […]
Martin Schwalb, Ralph Ewerth, Bernd Freisleben
The video coding standard H.264 supports video compression with a higher coding efficiency than previous standards. However, this comes at the expense of an increased encoding complexity, in particular for motion estimation, which becomes a very time-consuming task even for today’s central processing units (CPUs). On the other hand, modern graphics hardware includes a […]
Guobin Shen, Guang-Ping Gao, Shipeng Li, Heung-Yeung Shum, Ya-Qin Zhang
Most modern computers or game consoles are equipped with powerful yet cost-effective graphics processing units (GPUs) to accelerate graphics operations. Though the graphics engines in these GPUs are specially designed for graphics operations, can we harness their computing power for more general non-graphics operations? The answer is positive. In this paper, we present our study […]
R. Rodriguez, J.L. Martinez, G. Fernandez-Escribano, J.M. Claver, J.L. Sanchez
H.264/AVC defines a very efficient algorithm for inter prediction, but it is very time-consuming. With the emergence of general-purpose graphics processing units (GPGPU), a new door has been opened to running this video algorithm on these small processing units. In this paper, a step forward is taken towards an implementation of the […]


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There are no restrictions on the number of runs.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the compilation and execution terminal output logs will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
