Posts

Jul, 1

vSMC: Parallel Sequential Monte Carlo in C++

Sequential Monte Carlo is a family of algorithms for sampling from a sequence of distributions. Some of these algorithms, such as particle filters, are widely used in physics and signal processing research. More recent developments have established their application in more general inference problems such as Bayesian modeling. These algorithms have attracted considerable attention […]
Jul, 1

A Survey On Parallelization Of Data Mining Techniques

This paper gives an overview of various parallelization techniques that improve the performance of existing data mining algorithms and make them capable of handling large amounts of data. There are a variety of techniques for achieving parallelization in the data mining field; this paper presents a brief introduction to a few of the popular ones. […]
Jul, 1

Generating Efficient Data Movement Code for Heterogeneous Architectures with Distributed-Memory

Programming for parallel architectures that do not have a shared address space is extremely difficult due to the need for explicit communication between memories of different compute devices. A heterogeneous system with CPUs and multiple GPUs and a distributed-memory cluster are examples of such systems. Past works that try to automate data movement for distributed-memory […]
Jul, 1

Exploiting multi-level parallelism in streaming applications for heterogeneous platforms with GPUs

Heterogeneous computing platforms support the traditional types of parallelism, such as instruction-level, data, task, and pipeline parallelism, and provide the opportunity to exploit a combination of different types of parallelism at different platform levels. The architectural diversity of platform components makes tapping into the platform's potential a challenging programming task. This thesis makes an […]
Jul, 1

Towards Performance-Portable, Scalable, and Convenient Linear Algebra

The rise of multi- and many-core architectures also gave birth to a plethora of new parallel programming models. Among these, the open industry standard OpenCL addresses this heterogeneity of programming environments by providing a unified programming framework. The price to pay, however, is that OpenCL requires additional low-level boilerplate code, when compared to vendor-specific solutions, […]
Jun, 30

Cropped Quad-Tree Based Solid Object Colouring with CUDA

In this study, the surfaces of solid objects are coloured with the Cropped Quad-Tree method, using GPU computing for optimization. Numerous methods are used in solid object colouring. Studies carried out in different fields show that the quad-tree method holds a prominent position in terms of speed and performance. Cropped […]
Jun, 30

Accelerating SELECT WHERE and SELECT JOIN Queries on a GPU

This paper presents implementations of a few selected SQL operations using the CUDA programming framework on the GPU platform. Nowadays, GPU parallel architectures give high speed-ups on certain problems. As a result, the number of non-graphical problems that can be run and sped up on the GPU keeps increasing. In particular, there has been a lot of […]
Jun, 30

HadoopCL: MapReduce on Distributed Heterogeneous Platforms Through Seamless Integration of Hadoop and OpenCL

As the scale of high performance computing systems grows, three main challenges arise: the programmability, reliability, and energy efficiency of those systems. Accomplishing all three without sacrificing performance requires a rethinking of legacy distributed programming models and homogeneous clusters. In this work, we integrate Hadoop MapReduce with OpenCL to enable the use of heterogeneous processors […]
Jun, 30

Intel Xeon Phi Coprocessor High-Performance Programming

This book is useful even before you ever touch a system with an Intel Xeon Phi coprocessor. To ensure that your applications run at maximum efficiency, the authors emphasize key techniques for programming any modern parallel computing system whether based on Intel Xeon processors, Intel Xeon Phi coprocessors, or other high performance microprocessors. Applying these […]
Jun, 30

Best Practice Guide – Intel Xeon Phi

This best practice guide provides information about Intel’s MIC architecture and programming models for the Intel Xeon Phi coprocessor in order to enable programmers to achieve good performance of their applications. The guide covers a wide range of topics from the description of the hardware of the Intel Xeon Phi coprocessor through information about the […]
Jun, 29

A model of dynamic compilation for heterogeneous compute platforms

Trends in computer engineering place renewed emphasis on increasing parallelism and heterogeneity. The rise of parallelism adds an additional dimension to the challenge of portability, as different processors support different notions of parallelism, whether vector parallelism executing in a few threads on multicore CPUs or large-scale thread hierarchies on GPUs. Thus, software experiences obstacles to […]
Jun, 29

Adaptation of algorithms for underwater sonar data processing to GPU-based systems

In this master's thesis, algorithms for acoustic simulations in underwater environments are ported to GPU processing. The GPU parallel computing platforms used are CUDA, OpenCL and SkePU. The purpose of this master's thesis is to adapt the algorithms and evaluate the performance of the ported versions on two modern NVIDIA GPUs, the Tesla K20 and the Quadro K5000. Several optimizations, described […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
