Lihui Zhang, Cuiying Guo, Yong Tang, Mengya Lv, Ying Li, Jing Zhao
Constructing a deep ocean surface from small tiled patches, which greatly reduces computation, is a very common approach in ocean simulation. However, it causes obvious repetitive artifacts in the foam on the surface. We propose a dynamic threshold condition based on global coordinates that controls and reduces the periodic repetition. Using the FFT method to generate a small patch […]
Carlo del Mundo, Wu-chun Feng
The fast Fourier transform (FFT), a spectral method that computes the discrete Fourier transform and its inverse, pervades many applications in digital signal processing, such as imaging, tomography, and software-defined radio. Its importance has caused the research community to expend significant resources to accelerate the FFT, of which FFTW is the most prominent example. With […]
Dusan B. Gajic
Galois field (GF) expressions are polynomials used as representations of multiple-valued logic (MVL) functions. For this purpose, MVL functions are considered as functions defined over a finite (Galois) field of order p – GF(p). The problem of computing these functional expressions has an important role in areas such as digital signal processing and logic design. […]
Tarun Beri, Sorav Bansal, Subodh Kumar
We present a system that enables simple and intuitive programming of CPU+GPU clusters. This system relieves the programmer of the burden of load balancing, detailed data communication, task mapping, scheduling, etc. Our programming model is based on a bulk synchronous distributed shared memory model, which is suitable for heterogeneous multi-GPU clusters, especially so for compute intensive […]
Michael Mathieu, Mikael Henaff, Yann LeCun
Convolutional networks are one of the most widely employed architectures in computer vision and machine learning. In order to leverage their ability to learn complex functions, large amounts of data are required for training. Training a large convolutional network to produce state-of-the-art results can take weeks, even when using modern GPUs. Producing labels using a […]
Duncan McBain
The aim of this project was to port the FFT routines of LLRP to CUDA, which was done successfully. This success is quantified as the FFT portions of the program executing in a much shorter time than the FFTW transforms. The project shows that GPUs are certainly viable for use in numerical codes such as […]
Shuo Chen
Graphics Processing Units (GPUs) have proved to be a promising platform for accelerating large Fast Fourier Transform (FFT) computations. However, current GPU-based FFT implementations use only the GPU for computation and employ the CPU as a mere memory-transfer controller. The computational power of today’s high-performance CPUs is wasted. In this project, a hybrid optimization framework […]
Xavier Gibert Serra, Vishal M. Patel, Demetrio Labate, Rama Chellappa
Shearlets have emerged in recent years as one of the most successful methods for the multiscale analysis of multidimensional signals. Unlike wavelets, shearlets form a pyramid of well-localized functions defined not only over a range of scales and locations, but also over a range of orientations and with highly anisotropic supports. As a result, […]
Thai V. Hoang, Xavier Cavin, Patrick Schultz, David W. Ritchie
BACKGROUND: Picking images of particles in cryo-electron micrographs is an important step in solving the 3D structures of large macromolecular assemblies. However, in order to achieve sub-nanometre resolution it is often necessary to capture and process many thousands or even several millions of 2D particle images. Thus, a computational bottleneck in reaching high resolution is […]
M. Harper Langston, Muthu Baskaran, Benoit Meister, Nicolas Vasilache, Richard Lethin
As distributed memory systems grow larger, communication demands have increased. Unfortunately, while the costs of arithmetic operations continue to decrease rapidly, communication costs have not. As a result, there has been a growing interest in communication-avoiding algorithms for some of the classic problems in numerical computing, including communication-avoiding Fast Fourier Transforms (FFTs). A previously-developed low-communication […]
Thai V. Hoang, Xavier Cavin, David W. Ritchie
Fitting high resolution protein structures into low resolution cryo-electron microscopy (cryo-EM) density maps is an important technique for modeling the atomic structures of very large macromolecular assemblies. This article presents "gEMfitter", a highly parallel fast Fourier transform (FFT) EM density fitting program which can exploit the special hardware properties of modern graphics processor units (GPUs) […]
Joao Andrade, Vitor Silva, Gabriel Falcao
The FFT plays a fundamental role in OFDM programmable digital baseband communication systems under the SDR context. The core nature of this algorithm marks it as a primary target for acceleration. Since long frame lengths of the FFT are desirable in order to achieve higher bitrates, the computational complexity becomes even more significant. In this […]

* * *

Free GPU computing nodes at

Registered users can now run their OpenCL application at We provide 1 minute of compute time per run on two nodes with two AMD and one nVidia graphics processing units, respectively. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 11.4
  • SDK: AMD APP SDK 2.8
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 5.0.35, AMD APP SDK 2.8

Completed OpenCL projects should be uploaded via the User dashboard (see the instructions and example there); terminal output logs from compilation and execution will be provided to the user.

The information sent to will be treated according to our Privacy Policy.

HGPU group © 2010-2014

All rights belong to the respective authors