Tom Runia
In this thesis we design, implement and study a high-speed object detection framework. Our baseline detector uses integral channel features as object representation and AdaBoost as supervised learning algorithm. We suggest the implementation of two approximation techniques for speeding up the baseline detector and show their effectiveness by performing experiments on both detection quality and […]
Di Zhao
Mobile GPU computing, or System on Chip with embedded GPU (SoC GPU), has recently come into great demand. Since these SoCs are designed for mobile devices with real-time applications such as image processing and video processing, highly efficient implementations of the wavelet transform are essential for these chips. In this paper, we develop two SoC GPU based DWT: […]
Roelof Kemp
In this thesis we described two frameworks for distributed smartphone computing: one for applications with compute-intensive tasks and another for applications that take contextual sensor information into account. Both frameworks provide a common structure for the development of distributed smartphone applications, thereby extending the possible distribution model options for such applications. Both […]
Benjamin Y. Cho, Won Seob Jeong, Doohwan Oh, Won Woo Ro
Considerable research has been conducted recently on near-data processing techniques as real-world tasks increasingly involve large-scale and high-dimensional data sets. The advent of solid-state drives (SSDs) has spurred further research because of their processing capability and high internal bandwidth. However, the data processing capability of conventional SSD systems has not been impressive. In particular, they […]
Kwang-Ting (Tim) Cheng, Xin Yang and Yi-Chu Wang
Optimizing the performance of compute-intensive vision apps running on a mobile application processor (AP) is critical to a satisfactory experience for smartphone and tablet users. Most existing vision algorithms were primarily designed and implemented for desktop and server platforms. Porting them to a mobile platform without adapting the algorithms to account for the platform’s limitations would cause serious […]
Roelof Kemp, Nicholas Palmer, Thilo Kielmann, Henri Bal, Bastiaan Aarts, Anwar Ghuloum
The processing power of mobile devices is continuously increasing. In this paper we perform a case study in which we assess three different programming models that can be used to leverage this processing power for compute-intensive tasks. We use an imaging algorithm and compare a reference implementation of this algorithm based on OpenCV with […]
Blaine Rister, Guohui Wang, Michael Wu, Joseph R. Cavallaro
Emerging mobile applications, such as augmented reality, demand robust feature detection at high frame rates. We present an implementation of the popular Scale-Invariant Feature Transform (SIFT) feature detection algorithm that incorporates the powerful graphics processing unit (GPU) in mobile devices. Where the usual GPU methods are inefficient on mobile hardware, we propose a heterogeneous dataflow […]
Kwang-Ting (Tim) Cheng, Yi-Chu Wang
As the GPU becomes an integrated component in handheld devices like smartphones, we have been investigating the opportunities and limitations of utilizing the ultra-low-power GPU in a mobile platform as a general-purpose accelerator, similar to its role in desktop and server platforms. The special focus of our investigation has been on the mobile GPU’s role for energy-optimized […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
