10802

Posts

Oct, 21

QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems

The multi-GPU open-source package QCDGPU for lattice Monte Carlo simulations of pure SU(N) gluodynamics in external magnetic field at finite temperature and O(N) model is developed. The code is implemented in OpenCL, tested on AMD and NVIDIA GPUs, AMD and Intel CPUs and may run on other OpenCL-compatible devices. The package contains minimal external library […]
Oct, 21

An OpenCL-based Implementation of H.264 Encoder

We present an accelerated implementation of high-speed video stream encoder for the H.264 digital video codec standard. Based on the parallel processing techniques with GPU’s, we used an OpenCL-based GPU kernel programs. We achieved a high-level CPU-GPU interoperability, through making CPU perform all input/output operations and overall stream control, while GPU does the core encoding […]
Oct, 21

Solving Multiple Queries through a Permutation Index in GPU

Query-by-content by means of similarity search is a fundamental operation for applications that deal with multimedia data. For this kind of query it is meaningless to look for elements exactly equal to the one given as query. Instead, we need to measure dissimilarity between the query object and each database object. The metric space model […]
Oct, 21

Concurrent kernel execution on Graphic Processing Units

General Purpose Graphic Processing Unit (GPGPU) are now used in high performance computing (HPC) for their massively parallel computing aspect and capabilities. Those devices integrate hundreds of computing unit (computing core). Usually, such a level of parallelism is used to solve simulation problems (heat transfer, …) because of the numerical representation of simulated environment (matrices). […]
Oct, 21

Moim: A Multi-GPU MapReduce Framework

MapReduce greatly decrease the complexity of developing applications for parallel data processing. To considerably improve the performance of MapReduce applications, we design a new MapReduce framework, called Moim, which 1) effectively utilizes both CPUs and GPUs (general purpose Graphics Processing Units), 2) overlaps CPU and GPU computations, 3) enhances load balancing in the map and […]
Oct, 21

Understanding the Costs of Many-Task Computing Workloads on Intel Xeon Phi Coprocessors

Many-Task Computing (MTC) aims to bridge the gap between HPC and HTC. MTC emphasizes running many computational tasks over a short period of time, where tasks can be either dependent or independent of one another. MTC has been well supported on Clouds, Grids, and Supercomputers on traditional computing architectures, but the abundance of hybrid large-scale […]
Oct, 21

Optimization of the HEFT algorithm for a CPU-GPU environment

Scheduling applications efficiently on a network of computing systems is crucial for high performance. This problem is known to be NP-Hard and is further complicated when applied to a CPU-GPU heterogeneous environment. Heuristic algorithms like Heterogeneous Earliest Finish Time (HEFT) have shown to produce good results for other heterogeneous environments like Grids and Clusters. In […]
Oct, 21

Energy Efficiency Studies of Mont Blanc Applications

In this thesis, the performance and energy efficiency of four different implementations of matrix multiplication, written in OmpSs and OpenCL, is tested and evaluated. The benchmarking is done using an Intel Ivy Bridge Core i7 3770K. The results are evaluated and discussed with regards to different optimization configurations, like vectorization and multi-threading. Energy measurements are […]
Oct, 21

Work Efficient Parallel Algorithms for Large Graph Exploration

Graph algorithms play a prominent role in several fields of sciences and engineering. Notable among them are graph traversal, finding the connected components of a graph, and computing shortest paths. There are several efficient implementations of the above problems on a variety of modern multiprocessor architectures. It can be noticed in recent times that the […]
Oct, 21

Architecture-and Workload-Aware Heterogeneous Algorithms for Sparse Matrix Vector Multiplication

Multiplying a sparse matrix with a vector, denoted spmv, is a fundamental operation in linear algebra with several applications. Hence, efficient and scalable implementation of spmv has been a topic of immense research. Recent efforts are aimed at implementations on GPUs, multicore architectures, and such emerging computational platforms. Owing to the highly irregular nature of […]
Oct, 19

Multi-Scale Scheduling Techniques for Signal Processing Systems

A variety of hardware platforms for signal processing has emerged, from distributed systems such as Wireless Sensor Networks (WSNs) to parallel systems such as Multicore Programmable Digital Signal Processors (PDSPs), Multicore General Purpose Processors (GPPs), and Graphics Processing Units (GPUs) to heterogeneous combinations of parallel and distributed devices. When a signal processing application is implemented […]
Oct, 19

FlexGrip: A Soft GPGPU for FPGAs

Over the past decade, soft microprocessors and vector processors have been extensively used in FPGAs for a wide variety of applications. However, it is difficult to straightforwardly extend their functionality to support conditional and thread-based execution characteristic of general-purpose graphics processing units (GPGPUs) without recompiling FPGA hardware for each application. In this paper, we describe […]

Recent source codes

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org