12543
Istvan Zoltan Reguly
The last decade saw the long tradition of frequency scaling of processing units grind to a halt, and efforts were re-focused on maintaining computational growth by other means; such as increased parallelism, deep memory hierarchies and complex execution logic. After a long period of "boring productivity", a host of new architectures, accelerators, programming languages and […]
View View   Download Download (PDF)   
Fabio Luporini, Ana Lucia Varbanescu, Florian Rathgeber, Gheorghe-Teodor Bercea, J. Ramanujam, David A. Ham, Paul H.J. Kelly
The numerical solution of partial differential equations using the finite element method is one of the key applications of high performance computing. Local assembly is its characteristic operation. This entails the execution of a problem-specific kernel to numerically evaluate an integral for each element in the discretized problem domain. Since the domain size can be […]
View View   Download Download (PDF)   
Manob Jyoti Saikia, Rajan Kanhirodan, Ram Mohan Vasu
We have developed a graphics processor unit (GPU-) based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses Broyden approach for updating the Jacobian matrix and thereby updating the […]
View View   Download Download (PDF)   
Tingxing Dong, Veselin Dobrev, Tzanio Kolev, Robert Rieben, Stanimire Tomov, Jack Dongarra
Power and energy consumption are becoming an increasing concern in high performance computing. Compared to multi-core CPUs, GPUs have a much better performance per watt. In this paper we discuss efforts to redesign the most computation intensive parts of BLAST, an application that solves the equations for compressible hydrodynamics with high order finite elements, using […]
View View   Download Download (PDF)   
Zheng Xia
By consulting the state-of-the-art methods on massive linear equations solving and parallel computing, the main issue of calculation have been extracted from finite element method. The author test some solving routines on the CPU based as well as design and implement on GPU by using CUDA. The coalesced access result on GPU shows a ten […]
View View   Download Download (PDF)   
Ramin Mafi
Computer-based surgical simulation and non-rigid medical image registration in image-guided interventions are examples of applications that would benefit from real-time deformation simulation of soft tissues. The physics of deformation for biological soft-tissue is best described by nonlinear continuum mechanics-based models which then can be discretized by the Finite Element Method (FEM) for a numerical solution. […]
View View   Download Download (PDF)   
M.Wozniak, K.Kuznik, M. Paszynski, V. M. Calo, D. Pardo
In this paper we present computational cost estimates for parallel shared memory isogeometric multi-frontal solver. The estimates show that the ideal isogeometric shared memory parallel direct solver scales as O(p^2 log(N/p)) for one dimensional problems, O(Np^2) for two dimensional problems, and O(N^(4/3)p^2) for three dimensional problems, where N is the number of degrees of freedom, […]
View View   Download Download (PDF)   
Bjorn Nutti, Dragan Marinkovic
The paper presents a highly efficient way of simulating the dynamic behavior of deformable objects by means of the finite element method (FEM) with computations performed on Graphics Processing Units (GPU). The presented implementation reduces bottlenecks related to memory accesses by grouping the necessary data per node pairs, in contrast to the classical way done […]
View View   Download Download (PDF)   
A. F. P. Camargos, V. C. Silva
We present a performance analysis of a parallel implementation of both conjugate gradient and preconditioned conjugate gradient solvers using graphic processing units with CUDA parallel programming model. The solvers were optimized for a fast solution of sparse systems of equations arising from Finite Element Analysis (FEA) of electromagnetic phenomena. The preconditioners were Incomplete Cholesky factorization […]
View View   Download Download (PDF)   
Graham Robert Markall
How do we build maintainable, robust, and performance-portable scientific applications? This thesis argues that the answer to this software engineering question in the context of the finite element method is through the use of layers of Domain-Specific Languages (DSLs) to separate the various concerns in the engineering of such codes. Performance-portable software achieves high performance […]
View View   Download Download (PDF)   
M. Paszynski, K. Kuznik, V.M. Calo, D. Pardo
We present a multi-frontal direct solver for two dimensional isogeometric finite element method simulations with NVIDIA CUDA and perform numerical experiments for linear, quadratic and cubic B-splines. We compare the computational cost O(Np^2) for 2D parallel shared memory implementation with the corresponding estimate O(N^1.5p^3) for a standard 2D sequential implementation. We conclude the presentation with […]
View View   Download Download (PDF)   
Krzysztof Banas, Przemyslaw Plaszewski, Pawel Maciol
The paper considers the problem of implementation on graphics processors of numerical integration routines for higher order finite element approximations. The design of suitable GPU kernels is investigated in the context of general purpose integration procedures, as well as particular example applications. The most important characteristic of the problem investigated is the large variation of […]
View View   Download Download (PDF)   
Page 1 of 612345...Last »

* * *

* * *

Like us on Facebook

HGPU group

128 people like HGPU on Facebook

Follow us on Twitter

HGPU group

1193 peoples are following HGPU @twitter

Featured events

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL application at hgpu.org. We provide 1 minute of computer time per each run on two nodes with two AMD and one nVidia graphics processing units, correspondingly. There are no restrictions on the number of starts.

The platforms are

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 @ 2.8GHz 1055T
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL project should be uploaded via User dashboard (see instructions and example there), compilation and execution terminal output logs will be provided to the user.

The information send to hgpu.org will be treated according to our Privacy Policy

HGPU group © 2010-2014 hgpu.org

All rights belong to the respective authors

Contact us: