Lokendra Singh Panwar
Today, heterogeneous computing has reshaped the way scientists think about and approach high-performance computing (HPC). Hardware accelerators such as general-purpose graphics processing units (GPUs) and the Intel Many Integrated Core (MIC) architecture continue to make inroads in accelerating large-scale scientific applications. These advancements, however, introduce new sets of challenges to the scientific community such as: selection […]
V. Weinberg, M. Allalen, G. Brietzke
This whitepaper aims to discuss first experiences with porting an MPI-based real-world geophysical application to the new Intel Many Integrated Core (MIC) architecture. The selected code SeisSol is an application written in Fortran that can be used to simulate earthquake rupture and radiating seismic wave propagation in complex 3-D heterogeneous materials. The PRACE prototype cluster […]
Andreas Berg Skomedal
In the early days of computing, scientific calculations were done by specialized hardware. More recently, increasingly powerful CPUs took over and have been dominant for a long time. Now though, scientific computation is not only for the general CPU environment anymore. GPUs are specialized processors with their own memory hierarchy requiring more effort to program, […]
Ashwin M. Aji, Lokendra S. Panwar, Feng Ji, Milind Chabbi, Karthik Murthy, Pavan Balaji, Keith R. Bisset, James Dinan, Wu-chun Feng, John Mellor-Crummey, Xiaosong Ma, Rajeev Thakur
Scientific computing applications are quickly adapting to leverage the massive parallelism of GPUs in large-scale clusters. However, the current hybrid programming models require application developers to explicitly manage the disjointed host and GPU memories, thus reducing both efficiency and productivity. Consequently, GPU-integrated MPI solutions, such as MPI-ACC and MVAPICH2-GPU, have been developed that provide unified […]
Matti Leinonen, Russell J. Hewett, Xiangxiong Zhang, Lexing Ying, Laurent Demanet
Wave atoms are a low-redundancy alternative to curvelets, suitable for high-dimensional seismic data processing. This abstract extends the wave atom orthobasis construction to 3D, 4D, and 5D Cartesian arrays, and parallelizes it in a shared-memory environment. An implementation of the algorithm for NVIDIA CUDA capable graphics processing units (GPU) is also developed to accelerate computation […]
Chang Cai, Haiqing Chen, Ze Deng, Dan Chen, Samee U. Khan, Ke Zeng, Minxiao Wu
Finite difference is a simple, fast and effective numerical method for seismic wave modeling, and has been widely used in forward waveform inversion and reverse time migration. However, intensive calculation of three-dimensional seismic forward modeling has been restricting the industrial application of 3D pre-stack reverse time migration and inversion. Aiming at this problem, in this […]
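To make the finite-difference idea mentioned in this abstract concrete, here is a minimal sketch under stated assumptions. It solves the 1D acoustic wave equation u_tt = c^2 u_xx with second-order central differences on the CPU. It is not the authors' 3D method. The function names, grid size, and parameters are illustrative choices only.

```python
def step_wave_1d(u_prev, u_curr, c, dt, dx):
    """Advance u_tt = c^2 * u_xx by one time step.

    Second-order central differences in time and space, with fixed
    (zero-displacement) boundaries. Hypothetical helper for illustration.
    """
    r2 = (c * dt / dx) ** 2  # squared Courant number; c*dt/dx <= 1 is needed for stability
    n = len(u_curr)
    u_next = [0.0] * n  # boundary values stay at zero
    for i in range(1, n - 1):
        laplacian = u_curr[i - 1] - 2.0 * u_curr[i] + u_curr[i + 1]
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + r2 * laplacian
    return u_next


def simulate(n=101, steps=50, c=1.0, dx=1.0, dt=0.5):
    """Propagate a single-point initial displacement placed at the grid center."""
    u_prev = [0.0] * n
    u_prev[n // 2] = 1.0
    u_curr = list(u_prev)  # same field at both time levels: starts at rest
    for _ in range(steps):
        u_prev, u_curr = u_curr, step_wave_1d(u_prev, u_curr, c, dt, dx)
    return u_curr
```

Each grid point is updated only from its immediate neighbors, so the inner loop is the part that maps naturally onto parallel hardware. A 3D version adds neighbor terms along the other two axes.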
Bo Han, Xinzheng Lu, Zhen Xu, Yi Li
Refined models have been an important development trend of urban regional seismic damage prediction. However, the application of refined models has been limited due to their high computational cost if implemented on traditional Central Processing Unit (CPU) platforms. In recent years, Graphics Processing Unit (GPU) technology has been developed and applied rapidly due to its […]
Jing-Bo Chen, Guo-Feng Liu, Hong Liu
Finite-difference depth migration based on the one-way wave equation uses second-order, fourth-order, or other finite-order approximations for spatial derivatives. These finite-order approximations often lead to spatial dispersion errors and low accuracy. To avoid these errors, smaller mesh spacings are used, which results in a huge increase in computation cost. In this paper, we develop a new spectral […]
Shenyi Song, Yichen Zhou, Tingxing Dong, David A. Yuen
The Support Operator Method (SOM) is a numerical method for simulating seismic wave propagation by solving the three-dimensional viscoelastic equations. Its implementation, Support Operator Rupture Dynamics (SORD), has been proven to be highly scalable in large-scale multiprocessor calculations. This paper discusses accelerating SORD on GPUs using NVIDIA CUDA C. Compared to […]
Max Rietmann, Olaf Schenk, Helmar Burkhart
Among the various techniques for solving hyperbolic partial differential equations with inhomogeneous, irregularly-shaped domains, a relatively new type of finite element method has grown in popularity because of its flexibility and scalability across many parallel cores. Discontinuous Galerkin (DG) methods have shown themselves to be an effective scheme for the simulation of wave-propagation problems […]
Ahmed Adnan Aqrawi
In recent years, the gap between bandwidth and computational throughput has become a major challenge in high performance computing (HPC). Data intensive algorithms are particularly affected by the limitations of I/O bandwidth and latency. In this thesis project, data compression is explored so that fewer bytes need to be read from disk. The computational capabilities […]
Ahmed A. Aqrawi, Anne C. Elster
One of the main challenges of modern computer systems is to overcome the ever more prominent limitations of disk I/O and memory bandwidth, which today are thousands-fold slower than computational speeds. In this paper, we investigate reducing memory bandwidth and overall I/O and memory access times by using multithreaded compression and decompression of large datasets. […]
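As a rough illustration of multithreaded compression and decompression of a large dataset, the sketch below splits a byte buffer into independent chunks and processes them in a thread pool. It uses Python's standard zlib module as a stand-in. The abstract does not say which compressor or threading scheme the authors use, so the function names, chunk size, and worker count are assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_chunks(data, chunk_size=1 << 20, workers=4):
    """Split data into fixed-size chunks and compress each one in a worker thread.

    Hypothetical helper: chunk_size and workers are illustrative defaults.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the chunks can be rejoined later.
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(blobs, workers=4):
    """Decompress the chunks in worker threads and concatenate them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, blobs))
```

A round trip such as `decompress_chunks(compress_chunks(payload))` should return the original bytes. Because every chunk is compressed independently, the chunk size trades compression ratio against the amount of available parallelism.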

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with the AMD and nVidia graphics processing units listed below. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: nVidia CUDA Toolkit 6.5.14, AMD APP SDK 3.0
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.3
  • SDK: AMD APP SDK 3.0

A completed OpenCL project should be uploaded via the User dashboard (see the instructions and example there). Terminal output logs from compilation and execution will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.

HGPU group © 2010-2015 hgpu.org

All rights belong to the respective authors
