Zhang Jingbo
Graph mining and data management have become a significant area as new applications to data mining problems in social networking, computational biology, chemical data analysis and drug discovery continue to emerge. Although traditional mining methods have been extended to process graphs, many graph applications still confront huge challenges due to continuous […]
Yunpeng Cao
To monitor the spread of harmful information in a microblog system, large-scale microblog data must be processed in real time. This requires highly cost-effective parallel schemes. A parallel processing method on GPUs is put forward to monitor massive microblog data. The proposed scheme can fully exploit GPU features to schedule massive numbers of threads for data-intensive tasks. The detailed […]
Meisam Askari, Hossein Ebrahimpour, Azam Asilian Bidgoli, Farahnaz Hosseini
The Hough transform is one of the most widely used algorithms in image processing. Its major problems are its long running time and its heavy demand for computational resources. In this paper, we try to solve these problems by parallelizing the algorithm and implementing it on GPUs (Graphics Processing Units) using CUDA (Compute Unified […]
Joseph Issa
The change in processor architectures and 3D benchmarks makes performance characterization important for every processor and 3D application generation. Recent 3D applications require large amounts of data to be processed by the GPU and the CPU. This makes it important to analyze processor performance for different architectures and benchmarks so that benchmarks and processors […]
Saurabh Maniktala, Anisha Goel, A. B. Patki, R. C. Meharde
In the present era, chaos theory has tremendous potential in the computer science domain. Its true potential can be realized with the assistance of high performance computing aids such as GPUs, which have recently become available. The main purpose is to develop a high performance experimental laboratory in academic institutions, for […]
Changmin Lee, Won W. Ro, Jean-Luc Gaudiot
This paper presents a cooperative heterogeneous computing framework which enables the efficient utilization of available computing resources of host CPU cores for CUDA kernels, which are designed to run only on GPU. The proposed system exploits at runtime the coarse-grain thread-level parallelism across CPU and GPU, without any source recompilation. To this end, three features […]
Francisco Giunta, Raffaele Montella, Giuliano Laccetti, Florin Isaila, Francisco Javier Garcia Blas
Numerical models play a major role in the earth sciences, filling the gap between experimental and theoretical approaches. Nowadays, the computational approach is widely recognized as the complement to scientific analysis. Meanwhile, the huge amount of observed/modelled data, and the need to store, process, and refine them, often make the use of high […]
Chen He
Molecular Dynamics (MD) simulation is a computationally intensive application used in multiple fields. It can exploit a distributed environment due to inherent computational parallelism. However, most of the existing implementations focus on performance enhancement. They may not provide fault-tolerance for every time-step. MapReduce is a framework first proposed by Google for processing huge amounts of […]
S.J. Pennycook, S.D. Hammond, S.A. Jarvis, G.R. Mudalige
We present the performance analysis of a port of the LU benchmark from the NAS Parallel Benchmark (NPB) suite to NVIDIA’s Compute Unified Device Architecture (CUDA), and report on the optimisation efforts employed to take advantage of this platform. Execution times are reported for several different GPUs, ranging from low-end consumer-grade products to high-end HPC-grade […]
L. Stolz, H. Endt, M. Vaaraniemi, D. Zehe, W. Stechele
With the introduction of APIs like CUDA, Stream+ or OpenCL, modern Graphics Processing Units (GPUs) can be easily employed for general purpose computing. In addition, their comparatively low price per GFLOP makes them interesting candidates for coprocessors in future embedded Electronic Control Units (ECUs). Yet, as car manufacturers strive to reduce the Thermal Design Power (TDP) […]
Specifications
  • GPU: G96a/b
  • FLOPS: 67.2 GFLOPS
  • Stream Processing Units: 16
  • Core Clock: 550 MHz
  • Memory Clock: 1400 MHz
  • Effective Memory Clock: 2800 MHz
  • Memory Type: DDR2/GDDR3
  • Amount of memory: 256/512/1024 MB
  • Memory Bandwidth: 12.8/25.6 GB/sec
  • Buswidth: 128 bit
  • Tech process: 65/55 nm
  • Interface: PCIe 2.0 x16, PCI
  • PS/VS version: 4.1/4.1
  • DirectX compliance: 10
  • Retail Cards: […]
Gorka Lerchundi Osa
User needs increase as time passes. We started with room-sized computers, where punched cards served the same function as today's machine code object files, and we are now at a point where the number of processors within our graphics device is not enough for our requirements. […]

* * *


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with nVidia and AMD graphics processing units, as listed below. There is no limit on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

Information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
