John Ashley, Amy J. Braverman
Multi-trial sampled K-means performance and scalability are studied as a stepping stone towards a Graphical Processing Unit implementation of Entropy Constrained Vector Quantization for interactive data compression. Basic parallelization strategies and data layout impacts are explored with K-means. The K-means implementation is extended to Entropy Constrained Vector Quantization, and additional tuning specific to the anticipated […]
Bryan Ching
Lossless data compression is used to reduce storage requirements, allowing for the relief of I/O channels and better utilization of bandwidth. The Lempel-Ziv lossless compression algorithms form the basis for many of the most commonly used compression schemes. General purpose computing on graphic processing units (GPGPUs) allows us to take advantage of the massively parallel […]
Jason J. Ford, Timothy L. Molloy, Joanne L. Hall
This paper investigates compressed sensing using hidden Markov models (HMMs) and hence provides an extension of recent single frame, bounded error sparse decoding problems into a class of sparse estimation problems containing both temporal evolution and stochastic aspects. This paper presents two optimal estimators for compressed HMMs. The impact of measurement compression on HMM filtering […]
Aditya Deshpande
In earlier times, computer systems had only a single core or processor. In these computers, the number of transistors on-chip (i.e. on the processor) doubled every two years and all applications enjoyed free speedup. Subsequently, with more and more transistors being packed on-chip, power consumption became an issue, frequency scaling reached its limits and industry […]
Jingqi Ao
Lossless compression is still in high demand in medical image applications despite improvements in the computing capability and decrease in storage cost in recent years. With the development of General Purpose Graphic Processing Unit (GPGPU) computing techniques, sequential lossless image compression algorithms can be modified to achieve more efficiency and speed. Backward Coding of Wavelet […]
Andrew A. Haigh, Eric C. McCreath
The realistic simulation of ultrasound wave propagation is computationally intensive. The large size of the grid and low degree of reuse of data means that it places a great demand on memory bandwidth. Graphics Processing Units (GPUs) have attracted attention for performing scientific calculations due to their potential for efficiently performing large numbers of floating […]
Tran Minh Quan, Won-Ki Jeong
The discrete wavelet transform (DWT) has been widely used in many image compression applications, such as JPEG2000 and compressive sensing MRI. Even though a lifting scheme [1] has been widely adopted to accelerate DWT, only a handful of studies have addressed its efficient implementation on many-core accelerators, such as graphics processing units (GPUs). Moreover, […]
Hovhannes M. Bantikyan
The discrete wavelet transform has a huge number of applications in science, engineering, mathematics and computer science. Most notably, it is used for signal coding to represent a discrete signal in a more redundant form, often as a preconditioning for data compression. Beginning in the 1990s, wavelets have been found to be a powerful tool […]
H.M. Magboub, M.A. Osman
This paper investigates the use of the Compute Unified Device Architecture (CUDA) programming model to implement a Discrete Wavelet Transform (DWT) based algorithm for efficient image compression. The Peak Signal to Noise Ratio (PSNR) is used to evaluate image reconstruction quality. The results are presented and discussed.
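For readers unfamiliar with the metric, the following is a minimal sketch of how PSNR is commonly computed, using the standard definition PSNR = 10 · log10(MAX² / MSE). It is not taken from the paper: the function name `psnr`, the flat-list pixel representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """Peak Signal to Noise Ratio in dB between two equal-length pixel sequences.

    Hypothetical helper for illustration; max_value=255 assumes 8-bit samples.
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error between the original and reconstructed pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Toy 4-pixel "image" and a slightly distorted reconstruction.
    print(psnr([52, 55, 61, 66], [53, 55, 60, 66]))
```

Higher values indicate a reconstruction closer to the original; a lossless round trip yields an infinite PSNR under this definition.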
Rita Silva, Telmo Marques, Jorge Desirat, Patricio Domingues
Many-core computing is a currently growing concept that allows the true parallelization of computational tasks. In the particular case of this paper, the vector quantization algorithm was adapted to the many-core concept with the objective of compressing images encoded in the PGM format. For that, a given sequential implementation of the algorithm was optimized and […]
Aditya Deshpande, P J Narayanan
In this paper, we present an all-core implementation of the Burrows-Wheeler Compression algorithm that exploits all computing resources on a system. Our focus is to provide significant benefit to everyday users on common end-to-end applications by exploiting the parallelism of multiple CPU cores and many-core GPU on their machines. The all-core framework is suitable for […]
Ajith Padyana, Devi Sudheer, Pallav Kumar Baruah, Ashok Srinivasan
Compute-intensive tasks in high-end high performance computing (HPC) systems often generate large amounts of data, especially floating-point data, that need to be transmitted over the network. Although computation speeds are very high, the overall performance of these applications is affected by the data transfer overhead. Moreover, as data sets are growing in size rapidly, bandwidth […]

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units. There is no restriction on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). The terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors
