
Posts

Nov, 4

Performance of GTX Titan X GPUs and Code Optimization

Nvidia recently released a new GPU model, the GTX Titan X (TX), in the lineage of the Maxwell architecture. We use our conjugate gradient code and non-perturbative renormalization code to measure the performance of the TX. The results are compared with those of the GTX Titan Black (TB), in the lineage of the Kepler architecture. We observe […]
Nov, 4

Accelerating Twisted Mass LQCD with QPhiX

We present the implementation of twisted mass fermion operators for the QPhiX library. We analyze the performance on the Intel Xeon Phi (Knights Corner) coprocessor as well as on Intel Xeon Haswell CPUs. In particular, we demonstrate that on the Xeon Phi 7120P the Dslash kernel is able to reach 80% of the theoretical peak […]
Nov, 4

Exact diagonalization of quantum lattice models on coprocessors

We implement the Lanczos algorithm on an Intel Xeon Phi coprocessor and compare its performance to a multi-core Intel Xeon CPU and an NVIDIA graphics processor. The Xeon and the Xeon Phi are parallelized with OpenMP and the graphics processor is programmed with CUDA. The performance is evaluated by measuring the execution time of a […]
Nov, 4

Heterogeneous CPU/(GP) GPU Memory Hierarchy Analysis and Optimization

Heterogeneous systems, more specifically CPU-GPGPU platforms, have gained a lot of attention due to the excellent speedups GPUs can achieve with comparatively little energy consumption. However, the picture is not entirely positive: the complex programming models needed to fully exploit the devices and the data movement overheads are some […]
Nov, 4

3rd International Conference on Mechanical, Electronics and Computer Engineering (CMECE), 2016

Dear Scholars and Researchers, Warmest Greetings from CMECE 2016! This is the conference committee of the 2016 3rd International Conference on Mechanical, Electronics and Computer Engineering (CMECE 2016). We are very pleased to tell you that CMECE 2016 will be held in New York, USA during January 7-9, 2016. CMECE 2014 and 2015 were held in Sanya, China […]
Nov, 4

4th International Conference on Nano and Materials Science (ICNMS), 2016

Dear Scholars and Researchers, Warmest Greetings from ICNMS 2016! This is the conference committee of the 2016 4th International Conference on Nano and Materials Science (ICNMS 2016). We are very pleased to tell you that ICNMS 2016 will be held in New York, USA during January 7-9, 2016. Publication: All papers, both invited and contributed, will be reviewed […]
Nov, 3

Structural Agnostic SpMV: Adapting CSR-Adaptive for Irregular Matrices

Sparse matrix vector multiplication (SpMV) is an important linear algebra primitive. Recent research has focused on improving the performance of SpMV on GPUs when using compressed sparse row (CSR), the most frequently used matrix storage format on CPUs. Efficient CSR-based SpMV obviates the need for other GPU-specific storage formats, thereby saving runtime and storage overheads. […]
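For readers unfamiliar with the format, the following is a minimal serial sketch of CSR-based SpMV. It is not the CSR-Adaptive algorithm from the paper; the function name `csr_spmv`, the array names `row_ptr`, `col_idx` and `values`, and the small example matrix are assumptions chosen for illustration only.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i inside
    col_idx (column indices) and values (nonzero entries).
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix (illustration only):
# [[4, 0, 1],
#  [0, 0, 0],
#  [2, 3, 0]]
row_ptr = [0, 2, 2, 4]
col_idx = [0, 2, 0, 1]
values = [4.0, 1.0, 2.0, 3.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```

The empty second row shows why row lengths matter: a naive one-thread-per-row GPU mapping of this loop leaves threads idle on short rows and overloaded on long ones, which is the kind of irregularity that adaptive CSR schemes try to address.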
Nov, 3

Software Defined Radio over CUDA

Software Defined Radio (SDR) is a wireless communication system in which the components of transmitters and receivers (filters, mixers, modulators) are mostly implemented in software. Thanks to this approach, it is possible to implement a single universal radio transceiver capable of multi-mode and multi-standard wireless communications. These capabilities are very useful for researchers and radio amateurs, who […]
Nov, 3

On the programmability of multi-GPU computing systems

Multi-GPU systems are widely used in High Performance Computing environments to accelerate scientific computations. This trend is expected to continue as integrated GPUs will be introduced to processors used in multi-socket servers and servers will pack a higher number of GPUs per node. GPUs are currently connected to the system through the PCI Express interconnect, […]
Nov, 3

Exploring Optimisations for the Local Assembly phase of Finite Element Methods on GPUs

Finite Element Methods (FEM) are ubiquitous in science and engineering where they are used in fields as diverse as structural analysis, ocean modeling and bioengineering. FEM allow us to find approximate solutions to a system of partial differential equations over an unstructured mesh. The first phase of solving a FEM problem, local assembly, involves computing […]
Nov, 3

A Framework for Transparent Execution of Massively-Parallel Applications on CUDA and OpenCL

We present a novel framework for simultaneous development for different massively parallel platforms. Currently, our framework supports CUDA and OpenCL, but it can be easily adapted to other programming languages. The main idea is to provide an easy-to-use abstraction layer that encapsulates calls to one's own parallel device code as well as to library functions. […]
Oct, 31

Investigation of General-Purpose Computing on Graphics Processing Units and its Application to the Finite Element Analysis of Electromagnetic Problems

In this dissertation, the hardware and API architectures of GPUs are investigated, and the corresponding acceleration techniques are applied on the traditional frequency domain finite element method (FEM), the element-level time-domain methods, and the nonlinear discontinuous Galerkin method. First, the assembly and the solution phases of the FEM are parallelized and mapped onto the granular […]

* * *


HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors
