Aditya Deshpande, P J Narayanan
In this paper, we present an all-core implementation of the Burrows-Wheeler Compression algorithm that exploits all computing resources on a system. Our focus is to provide significant benefit to everyday users on common end-to-end applications by exploiting the parallelism of the multiple CPU cores and the many-core GPU on their machines. The all-core framework is suitable for […]
Ajith Padyana, Devi Sudheer, Pallav Kumar Baruah, Ashok Srinivasan
Compute-intensive tasks in high-end high performance computing (HPC) systems often generate large amounts of data, especially floating-point data, that need to be transmitted over the network. Although computation speeds are very high, the overall performance of these applications is affected by the data transfer overhead. Moreover, as data sets are growing in size rapidly, bandwidth […]
Md. Enamul Haque, Abdullah Al Kaisan, Mahmudur R Saniat, Aminur Rahman
In this paper, we implemented both sequential and parallel versions of fractal image compression algorithms for medical images, which are highly self-similar, using the CUDA (Compute Unified Device Architecture) programming model to parallelize the program on the Graphics Processing Unit. There are several improvements in the implementation of the algorithm as well. […]
Luis Miguel de la Cruz, Daniel Monsivais
A two-phase (water and oil) flow model in a homogeneous porous medium is studied, considering immiscible and incompressible displacement. This model is solved numerically using the Finite Volume Method (FVM), and we compare four numerical schemes for the approximation of fluxes on the faces of the discrete volumes. We describe briefly how to obtain the […]
Andre Kessler
We investigate compression of large-volume spatial data using the wavelet transform, computed massively in parallel on NVIDIA graphics processing units (GPUs). In particular, Haar basis wavelets are used to achieve compression ratios of [100x] or more. Computation is done over a set of computing nodes consisting of multiple nodes and multiple GPUs per node. Significantly […]
Adnan Ozsoy, Martin Swany, Arun Chauhan
In this paper, we present an algorithm and provide the design improvements needed to port the serial Lempel-Ziv-Storer-Szymanski (LZSS) lossless data compression algorithm to a parallelized version suitable for general purpose graphic processor units (GPGPU), specifically for NVIDIA’s CUDA Framework. The two main stages of the algorithm, substring matching and encoding, are studied in detail to […]
Pinghao Li, Xiaoqian Jiang, Shuang Wang, Jihoon Kim, Hongkai Xiong, Lucila Ohno-Machado
BACKGROUND AND OBJECTIVE: Short-read sequencing is becoming the standard of practice for the study of structural variants associated with disease. However, with the growth of sequence data largely surpassing reasonable storage capability, the biomedical community is challenged with the management, transfer, archiving, and storage of sequence data. METHODS: We developed Hierarchical mUlti-reference Genome cOmpression (HUGO), […]
Manas Arora, Neha Maurya
Mars Rovers are unmanned machines sent to the planet Mars to analyze it and provide details about it. GPUs and Genetic Algorithms are upcoming technologies used in Mars Rovers for analyzing the data and sending it back to the Earth base station. GPU stands for Graphics Processing Unit in which Image compression is the […]
Wasuwee Sodsong, Jingun Hong, Seongwook Chung, Shin-Dug Kim, Bernd Burgstaller
With the emergence of social networks and improvements in computational photography, billions of JPEG images are shared and viewed on a daily basis. Desktops, tablets and smartphones constitute the vast majority of hardware platforms used for displaying JPEG images. Despite the fact that these platforms are heterogeneous multicores, no approach exists yet that is capable […]
K. Shuma Roshini, M. Tejaswi
In video communication, the whole content of a video cannot be stored without processing, so the video must be compressed before transmission and storage. This process is called video compression. Video compression plays an important role in real-time scouting and video conferencing applications. Regarding the entire motion based video compression process, movement estimation […]
F. Fusco, M. Vlachos, X. Dimitropoulos, L. Deri
Network traffic recorders are devices that record massive volumes of network traffic for security applications, like retrospective forensic investigations. When deployed over very high-speed networks, traffic recorders must process and store millions of packets per second. To enable interactive explorations of such large traffic archives, packet indexing mechanisms are required. Indexing packets at wire rates […]
Andreas Weinlich, Johannes Rehm, Peter Amon, Andreas Hutter, Andre Kaup
Medical imaging in hospitals requires fast and efficient image compression to support the clinical workflow and to save costs. Least-squares autoregressive pixel prediction methods combined with arithmetic coding constitute the state of the art in lossless image compression. However, the high computational complexity of both prevents the application of respective CPU implementations in practice. […]

Free GPU computing nodes

Registered users can now run their OpenCL applications on our GPU computing nodes. We provide 1 minute of compute time per run on two nodes with two AMD and one nVidia graphics processing units, respectively. There is no restriction on the number of runs.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); compilation and execution terminal output logs will be provided to the user.

Information sent to us will be treated according to our Privacy Policy.

HGPU group © 2010-2014

All rights belong to the respective authors
