Amir Gholami, Judith Hill, Dhairya Malhotra, George Biros
We present a new library for parallel distributed Fast Fourier Transforms (FFT). Despite the large amount of work on FFTs, we show that significant speedups can be achieved for distributed transforms. The importance of FFT in science and engineering and the advances in high performance computing necessitate further improvements. AccFFT extends existing FFT libraries for […]
Fredrik Andersson, Marcus Carlsson, Viktor V. Nikitin
The Radon transform and its adjoint, the back-projection operator, can both be expressed as convolutions in log-polar coordinates. Hence, fast algorithms for the application of the operators can be constructed by using FFT, if data is resampled at log-polar coordinates. Radon data is typically measured on an equally spaced grid in polar coordinates, and reconstructions […]
Feifei Shen, Zhenjian Song, Congrui Wu, Jiaqi Geng, Qingyun Wang
Studying general-purpose computation on the GPU (Graphics Processing Unit) can improve the image processing capability of microcomputer systems. This paper studies the parallelism of the different stages of the decimation-in-time radix-2 FFT algorithm, designs the butterfly and scramble kernels, and implements a 2D FFT on the GPU. The experimental results demonstrate the validity and […]
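As a rough illustration of the two stages named in this abstract, the following is a minimal sequential Python sketch of a decimation-in-time radix-2 FFT. It is not the paper's GPU implementation: the function names `bit_reverse` and `fft_iterative` are hypothetical, and the "scramble" step is assumed here to be the usual bit-reversal permutation. On a GPU, the butterflies within one stage are independent of each other and could each be handled by a separate thread.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_iterative(x):
    """Iterative decimation-in-time radix-2 FFT (length must be a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    # Scramble stage (assumed to be bit reversal): reorder the input.
    a = [complex(x[bit_reverse(i, bits)]) for i in range(n)]
    # Butterfly stages: log2(n) passes over the data.
    half = 1
    while half < n:
        step = 2 * half
        for start in range(0, n, step):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / step)  # twiddle factor
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
        half = step
    return a
```

A 2D transform can then be obtained by applying `fft_iterative` to every row and afterwards to every column of the image.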
Mohamed Amine Bergach, Emilien Kofman, Robert de Simone, Serge Tissot, Michel Syska
General-purpose multiprocessors (as, in our case, Intel IvyBridge and Intel Haswell) increasingly add GPU computing power to the former multicore architectures. When used for embedded applications (for us, Synthetic aperture radar) with intensive signal processing requirements, they must constantly compute convolution algorithms, such as the famous Fast Fourier Transform. Due to its "fractal" nature (the […]
Amir Gholami, Judith Hill, Dhairya Malhotra, George Biros
We present a new library for scalable 3-D Fast Fourier Transforms (FFT). Despite the large amount of work on 3-D FFTs, we show that significant speedups can be achieved for large problem sizes and core counts. The importance of FFT in science and engineering and the advances in high performance computing necessitate further improvements in […]
Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann LeCun
We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units. We introduce two new Fast Fourier Transform convolution implementations: one based on NVIDIA’s cuFFT library, and another based on a Facebook authored FFT implementation, fbfft, that provides significant speedups over cuFFT (over 1.5x) for whole CNNs. […]
Hesamaldin Nekouei
The value of a general solution for nonsymmetric eigenvalue problems is evident in many areas of engineering and scientific computation, such as reducing noise for a quieter ride in automotive engineering or calculating the natural frequency of a bridge in civil engineering. The main objective of this thesis is to design a […]
Hovhannes Bantikyan
It is well recognized in the computer algebra theory and systems communities that the Fast Fourier Transform (FFT) can be used for multiplying polynomials. Theory predicts that it is fast for "large enough" polynomials. The basic idea is to use fast polynomial multiplication to perform fast integer multiplication. We can achieve really fast FFT multiplication […]
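The idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's GPU code: the names `fft`, `poly_multiply`, and `int_multiply` are hypothetical, the transform is a plain recursive radix-2 FFT in double precision, and integers are assumed to be non-negative and split into base-10 digits. Double-precision rounding limits this sketch to moderately sized inputs.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.
    The inverse is unnormalized (the caller divides by n)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def poly_multiply(p, q):
    """Multiply integer-coefficient polynomials (lowest degree first)."""
    size = len(p) + len(q) - 1
    n = 1
    while n < size:
        n *= 2
    # Zero-pad to a power of two so the cyclic convolution does not wrap.
    fp = fft([complex(c) for c in p] + [0j] * (n - len(p)))
    fq = fft([complex(c) for c in q] + [0j] * (n - len(q)))
    prod = fft([x * y for x, y in zip(fp, fq)], invert=True)
    return [round((v / n).real) for v in prod[:size]]


def int_multiply(a, b):
    """Multiply non-negative integers via their base-10 digit polynomials."""
    da = [int(c) for c in reversed(str(a))]
    db = [int(c) for c in reversed(str(b))]
    coeffs = poly_multiply(da, db)
    # Evaluating the product polynomial at 10 propagates the carries.
    return sum(c * 10 ** i for i, c in enumerate(coeffs))
```

For example, `poly_multiply([1, 2, 3], [4, 5])` returns `[4, 13, 22, 15]`, the coefficients of (1 + 2x + 3x^2)(4 + 5x).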
Izumi Mizuno, Seiji Kameno, Amane Kano, Makoto Kuroo, Fumitaka Nakamura, Noriyuki Kawaguchi, Katsunori M. Shibata, Seisuke Kuji, Nario Kuno
We have developed a software-based polarization spectrometer, PolariS, to acquire full-Stokes spectra with a very high spectral resolution of 61 Hz. The primary aim of PolariS is to measure the magnetic fields in dense star-forming cores by detecting the Zeeman splitting of molecular emission lines. The spectrometer consists of a commercially available digital sampler and […]
Sagar Shrishailappa Masuti, Sylvain Barbot, Nachiket Kapre
Effective utilization of GPU processing capacity for scientific workloads is often limited by memory throughput and PCIe communication transfer times. This is particularly true for semi-analytic Fourier-domain computations in earthquake modeling (Relax) where operations on large-scale 3D data structures can require moving large volumes of data from storage to the compute in predictable but orthogonal […]
Jayanth Chennamangalam, Simon Scott, Glenn Jones, Hong Chen, John Ford, Amanda Kepley, D. R. Lorimer, Jun Nie, Richard Prestage, D. Anish Roshi, Mark Wagner, Dan Werthimer
The Graphics Processing Unit (GPU) has become an integral part of astronomical instrumentation, enabling high-performance online data reduction and accelerated online signal processing. In this paper, we describe a wide-band reconfigurable spectrometer built using an off-the-shelf GPU card. This spectrometer, when configured as a polyphase filter bank (PFB), supports a dual-polarization bandwidth of up to […]
Dale Nicholas Rattermann
Fast Poisson solvers using the Fast Fourier Transform on uniform grids are especially suited for parallel implementation, making them appropriate for portability on graphical processing unit (GPU) devices. The goal of the following work was to implement, test, and evaluate a fast Poisson solver for periodic boundary conditions for use on a variety of GPU […]
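To make the approach concrete, here is a minimal one-dimensional Python sketch of a spectral Poisson solve with periodic boundary conditions on [0, 2π). It is not the thesis's GPU solver: the names `dft` and `solve_poisson_periodic` are hypothetical, and a plain O(n²) DFT stands in for the FFT to keep the example short. The solver divides each Fourier coefficient of the right-hand side by the negative squared wavenumber and fixes the mean of the solution to zero.

```python
import cmath
import math


def dft(values, sign):
    """Plain O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for m, v in enumerate(values))
        for k in range(n)
    ]


def solve_poisson_periodic(f):
    """Solve u'' = f on a uniform periodic grid over [0, 2*pi).
    Assumes f has zero mean; returns the zero-mean solution u."""
    n = len(f)
    f_hat = dft(f, -1)
    u_hat = [0j] * n  # the k = 0 mode stays zero (zero-mean solution)
    for k in range(1, n):
        wave = k if k <= n // 2 else k - n  # signed wavenumber
        u_hat[k] = -f_hat[k] / (wave * wave)
    return [(v / n).real for v in dft(u_hat, +1)]
```

With the right-hand side f(x) = -sin(x) sampled on the grid, the returned values approximate u(x) = sin(x).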

* * *


Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units, as listed below. There is no limit on the number of runs.

The platforms are

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there); the terminal output logs from compilation and execution will be provided to the user.

Information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
