Posts

Oct, 23

Computer Finite-Difference Time-Domain Simulation of Electromagnetic Wave Propagation using GPUs

The Finite-Difference Time-Domain (FDTD) solution of Maxwell’s equations, a grid-based differential time-domain numerical modeling method, is an approach for the direct modelling of the penetration of structures by continuous plane waves. Although FDTD techniques are considered to be relatively easy to understand and to implement in software, such modelling methods require a high level of […]
Oct, 23

Massively Parallel Algorithms for CFD Simulation and Optimization on Heterogeneous Many-Core Architectures

In this dissertation we provide new numerical algorithms for use in conjunction with simulation-based design codes. These algorithms are designed for, and best suited to, emerging heterogeneous computing architectures which contain a combination of traditional multi-core processors and new programmable many-core graphics processing units (GPUs). We have developed the following numerical algorithms (i) […]
Oct, 23

On the performance of GPU public-key cryptography

Graphics processing units (GPUs) have become increasingly popular in recent years as a cost-effective means of accelerating various computationally intensive tasks. We study the particular case of modular exponentiation, the crucial operation behind most modern public-key cryptography algorithms. We focus our attention on the NVIDIA GT200 architecture, currently one of the most popular for […]
Oct, 23

Computational Biology and Applied Bioinformatics

Nowadays it is difficult to imagine an area of knowledge that can continue developing without the use of computers and informatics. Biology is no exception: it has seen unpredictable growth in recent decades, with the rise of a new discipline, bioinformatics, bringing together molecular biology, biotechnology and information technology. More recently, the […]
Oct, 23

STT-RAM for Shared Memory in GPUs

In the context of massively parallel processors such as Graphics Processing Units (GPUs), an emerging non-volatile memory – STT-RAM – provides substantial power and area savings, and increased capacity, compared to the conventionally used SRAM. The use of highly dense, low static power STT-RAM in processors that run just a few threads of execution does not seem […]
Oct, 23

Dataflow-based Design and Implementation of Image Processing Applications

Dataflow is a well known computational model and is widely used for expressing the functionality of digital signal processing (DSP) applications, such as audio and video data stream processing, digital communications, and image processing. These applications usually require real-time processing capabilities and have critical performance constraints. Dataflow provides a formal mechanism for describing specifications of […]
Oct, 22

Accelerating Component-Based Dataflow Middleware with Adaptivity and Heterogeneity

This dissertation presents research into the development of high performance dataflow middleware and applications on heterogeneous, distributed-memory supercomputers. We present coarse-grained state-of-the-art ad-hoc techniques for optimizing the performance of real-world, data-intensive applications in biomedical image analysis and radar signal analysis on clusters of computational nodes equipped with multi-core microprocessors and accelerator processors, such as the […]
Oct, 22

Implementing a Preconditioned Iterative Linear Solver Using Massively Parallel Graphics Processing Units

The research conducted in this thesis provides a robust implementation of a preconditioned iterative linear solver on programmable graphics processing units (GPUs). Solving a large, sparse linear system is the most computationally demanding part of many widely used power system analyses. This thesis presents a detailed study of iterative linear solvers with a focus on […]
Oct, 22

GPU performance prediction using parametrized models

Compilation on modern architectures has become an increasingly difficult challenge with the evolution of computers and computing needs. In particular, programmers expect the compiler to produce optimized code for a variety of hardware, making the most of their theoretical performance. For years this was not a problem because hardware vendors consistently delivered increases in clock […]
Oct, 22

CUDA Application Design and Development

As the computer industry retools to leverage massively parallel graphics processing units (GPUs), this book is designed to meet the needs of working software developers who need to understand GPU programming with CUDA and increase efficiency in their projects. CUDA Application Design and Development starts with an introduction to parallel computing concepts for readers with […]
Oct, 22

Accelerating molecular docking and binding site mapping using FPGAs and GPUs

Computational accelerators such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs) possess tremendous compute capabilities and are rapidly becoming viable options for effective high performance computing (HPC). In addition to their huge computational power, these architectures provide further benefits of reduced size and power dissipation. Despite their immense raw capabilities, achieving overall […]
Oct, 22

Hardware Transactional Memory for GPU Architectures

Graphics processor units (GPUs) are designed to efficiently exploit thread level parallelism (TLP), multiplexing execution of thousands of concurrent threads on a relatively small set of single-instruction, multiple-thread (SIMT) cores to hide various long latency operations. While threads within a CUDA block/OpenCL workgroup can communicate efficiently through an intra-core scratchpad memory, threads in different blocks […]

HGPU group © 2010-2026 hgpu.org

All rights belong to the respective authors
