Posts

Jul, 7

Comparative study of parallel programming models for multicore computing

Shared memory multi-core processor technology has seen a drastic development with faster and increasing number of processors per chip. This new architecture challenges computer programmers to write code that scales over these many cores to exploit full computational power of these machines. Shared-memory parallel programming paradigms such as OpenMP and Intel Threading Building Blocks (TBB) […]
Jul, 5

Triangular mesh simplification on the GPU

We present a simplification algorithm for triangular meshes, implemented on the GPU. The algorithm performs edge collapses driven by a quadric error metric. It uses data parallelism as provided by OpenCL and has no sequential segments in its main iterative structure in order to fully exploit the processing power of the GPU. Our implementation produces […]
Jul, 2

CFD Simulation of Jet Cooling and Implementation of Flow Solvers in GPU

In rolling of steel into thin sheets the final step is the cooling of the finished product on the Runout Table. In this thesis, the heat transfer into a water jet impinging on a hot flat steel plate was studied as the key cooling process on the runout table. The temperature of the plate was […]
Jul, 1

Towards Performance-Portable, Scalable, and Convenient Linear Algebra

The rise of multi- and many-core architectures also gave birth to a plethora of new parallel programming models. Among these, the open industry standard OpenCL addresses this heterogeneity of programming environments by providing a unified programming framework. The price to pay, however, is that OpenCL requires additional low-level boilerplate code, when compared to vendor-specific solutions, […]
Jun, 29

Adaptation of algorithms for underwater sonar data processing to GPU-based systems

In this master thesis, algorithms for acoustic simulations in underwater environments are ported for GPU processing. The GPU parallel computing platforms used are CUDA, OpenCL and SkePU. The purpose of this master thesis is to adapt and evaluate the ported algorithms’ performance on two modern NVIDIA GPUs, Tesla K20 and Quadro K5000. Several optimizations, described […]
Jun, 29

Efficient computation of constrained parameterizations on parallel platforms

Constrained isometric planar parameterizations are central to a broad spectrum of applications. In this work, we present a nonlinear solver developed on OpenCL that is efficiently parallelizable on modern massively parallel architectures. We establish how parameterization relates to mesh smoothing and show how to efficiently and robustly solve the planar mesh parameterization problem with […]
Jun, 24

An Energy Efficient GPGPU Memory Hierarchy with Tiny Incoherent Caches

With progressive generations and the ever-increasing promise of computing power, GPGPUs have been quickly growing in size, and at the same time, energy consumption has become a major bottleneck for them. The first level data cache and the scratchpad memory are critical to the performance of a GPGPU, but they are extremely energy inefficient due […]
Jun, 21

Parallel Language Programming In Different Platforms

The need to speed-up computing has introduced the interest to explore parallelism in algorithms and parallel programming. Technology is evolving fast but computing power in sequential execution is not increasing as much as earlier but CPUs contain more and more parallel computing resources. However, parallel algorithms may not be able to exploit all the parallelism […]
Jun, 17

GPU Programming in Rust: Implementing High Level Abstractions in a Systems Level Language

Graphics processing units (GPUs) have the potential to greatly accelerate many applications, and yet programming models still remain too low level. Many language-based solutions to date have addressed this problem by creating embedded domain-specific languages that compile to CUDA or OpenCL. These targets are meant for human programmers and thus are less than ideal compilation […]
Jun, 12

Performance of a GPU-based Direct Summation Algorithm for Computation of Small Angle Scattering Profile

Small Angle Scattering (SAS) of X-rays or neutrons is an experimental technique that provides valuable structural information for biological macromolecules under physiological conditions and with no limitation on the molecular size. In order to refine molecular structure against experimental SAS data, ab initio prediction of the scattering profile must be recomputed hundreds of thousands of […]
Jun, 10

Processing XPath Structural Constraints on GPU

Technologies such as CUDA and OpenCL have popularized the usage of graphics cards (GPUs) for general purpose programming, often with impressive performance gains. However, using such cards for speeding up XML Databases processing is yet to be fully explored. XML databases offer much flexibility for Web-oriented systems. Nonetheless, such flexibility comes at a considerable computational […]
Jun, 8

Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach

GPUs (Graphics processing units) can be used for general purpose parallel computation. Developers can develop parallel programs running on GPUs using different computing architectures like CUDA or OpenCL. The Optimal Matrix Chain Multiplication problem is an optimization problem to find the optimal order for multiplying a chain of matrices. The optimal order of multiplication depends […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
