
Posts

Sep, 25

OpenCL Task Partitioning in the Presence of GPU Contention

Heterogeneous multi- and many-core systems are increasingly prevalent in the desktop and mobile domains. On these systems it is common for programs to compete with co-running programs for resources. While multi-task scheduling for CPUs is a well-studied area, how to partition and map computing tasks onto the heterogeneous system in the presence of GPU contention […]
Sep, 25

Large Graphs on multi-GPUs

We studied the problem of developing an efficient BFS algorithm to explore large graphs having billions of nodes and edges. The size of the problem requires a parallel computing architecture. We proposed a new algorithm that performs a distributed BFS and the corresponding implementation on multi-GPU clusters. As far as we know, this is the […]
Sep, 25

Automatic Software Synthesis from High-Level ForSyDe Models Targeting Massively Parallel Processors

In the past decade we have witnessed an abrupt shift to parallel computing subsequent to the increasing demand for performance and functionality that can no longer be satisfied by conventional paradigms. As a consequence, the abstraction gap between the applications and the underlying hardware increased, triggering both industry and academia in several research directions. This […]
Sep, 25

Fast k-NNG construction with GPU-based quick multi-select

In this paper we describe a new brute force algorithm for building the k-Nearest Neighbor Graph (k-NNG). The k-NNG algorithm has many applications in areas such as machine learning, bioinformatics, and clustering analysis. While there are very efficient algorithms for data of low dimensions, for high dimensional data the brute force search is the best […]
Sep, 25

The density matrix renormalization group algorithm on kilo-processor architectures: implementation and trade-offs

In the numerical analysis of strongly correlated quantum lattice models one of the leading algorithms developed to balance the size of the effective Hilbert space and the accuracy of the simulation is the density matrix renormalization group (DMRG) algorithm, in which the run-time is dominated by the iterative diagonalization of the Hamilton operator. As the […]
Sep, 24

Evaluation of autoparallelization toolkits for commodity graphics hardware

In this paper we evaluate the performance of the OpenACC and Mint toolkits against C and CUDA implementations of the standard PolyBench test suite. Our analysis reveals that performance is similar in many cases, but that a certain set of code constructs impedes the ability of Mint to generate optimal code. We then present some […]
Sep, 24

Pipeline strategies to accelerate range query processing on a multi-GPU environment

Nowadays, similarity search is becoming a field of increasing interest because these kinds of methods can be applied to different areas in computer science and engineering, such as voice and image recognition, text retrieval, and many others. However, when processing large volumes of data, query response time can be quite high. In this case, it […]
Sep, 24

Realtime Deformation of Constrained Meshes Using GPU

Constrained meshes play an important role in freeform architectural design, as they can represent panel layouts on freeform surfaces. It is challenging to perform realtime manipulation on such meshes, because all constraints need to be respected during the deformation while the shape quality needs to be maintained. This usually leads to nonlinear constrained optimization problems, […]
Sep, 24

Parallel multi-agent path planning in dynamic environments for real-time applications

Current path planning algorithms are not efficient enough to provide optimal path planning in dynamic environments for a large number of agents in real time. Furthermore, there are no real-time algorithms that fully use the potential of parallelism. The goal of this thesis is to find a basis for such an algorithm. Based on the literature study, […]
Sep, 24

GPU Based Massive Parallel Kawasaki Kinetics In Monte Carlo Modelling of Lipid Microdomains

This paper introduces a novel method for simulating lipid biomembranes based on the Metropolis-Hastings algorithm and Graphics Processing Unit computational power. The method gives up to a 55-fold computational speedup in comparison to classical computations. An extensive study of the algorithm's correctness is provided. An analysis of the simulation results and of results obtained with classical simulation methodologies is presented.
Sep, 23

Performance of OpenCL

OpenCL is a relatively new standard that supports computation on a variety of parallel architectures. The author was unable to find reliable information about the performance of OpenCL programs on CPUs in comparison to traditional parallel processing standards like OpenMP. This paper describes the results of an experiment that tries to answer the following question: "Which […]
Sep, 23

Multi-GPU Acceleration of Black-Scholes Equation based Option Pricing

In high-frequency trading of options, where "milliseconds earn or lose millions", the computational speed of predicting option prices is the crucial factor for option traders to efficiently decide the price and evaluate the corresponding risk. The Black-Scholes equation is a mathematical equation describing option pricing over time. Multi-GPU is a recently developed platform for high-performance computing, which […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
