
Posts

Dec, 1

Application Synthesis and Optimization on Heterogeneous Parallel Processing Systems

Recently, hybrid systems consisting of general-purpose processors (CPUs) and accelerators such as graphics processing units (GPUs) have become a mainstream system architecture design for achieving high performance and power efficiency. However, this growing trend is forcing programmers to address issues and challenges in adapting legacy serial programs into heterogeneous parallel programs. To alleviate the burden […]
Dec, 1

Performance Analysis of a High-level Abstractions-based Hydrocode on Future Computing Systems

In this paper we present research on applying a domain specific high-level abstractions (HLA) development strategy with the aim to "future-proof" a key class of high performance computing (HPC) applications that simulate hydrodynamics computations at AWE plc. We build on an existing high-level abstraction framework, OPS, that is being developed for the solution of multi-block […]
Dec, 1

pyMIC: A Python Offload Module for the Intel Xeon Phi Coprocessor

Python has gained a lot of attention from the high performance computing community as an easy-to-use, elegant scripting language for rapid prototyping and development of flexible software. At the same time, there is an ever-growing need for more compute power to satisfy the demand for higher accuracy simulation or more detailed modeling. The Intel Xeon […]
Nov, 29

A Framework for Composing High-Performance OpenCL from Python Descriptions

Parallel processors have become ubiquitous; most programmers today have access to parallel hardware such as multi-core processors and graphics processors. This has created an implementation gap, where efficiency programmers with knowledge of hardware details can attain high performance by exploiting parallel hardware, while productivity programmers with application-level knowledge may not understand low-level performance trade-offs. Ideally, […]
Nov, 29

Code Generation Compiler for the OpenMP 4.0 Accelerator Model onto OMPSS

The aim of OpenMP, a well-known shared-memory programming API, is to make shared-memory multiprocessor programming easy through pragma directives. Until now, its interface consisted of task- and iteration-level parallelism for general-purpose CPUs. However, OpenMP includes the accelerator model in its latest 4.0 specification. OmpSs is an OpenMP extended […]
Nov, 29

A CUDA implementation of the High Performance Conjugate Gradient benchmark

The High Performance Conjugate Gradient (HPCG) benchmark has been recently proposed as a complement to the High Performance Linpack (HPL) benchmark currently used to rank supercomputers in the Top500 list. This new benchmark solves a large sparse linear system using a multigrid preconditioned conjugate gradient (PCG) algorithm. The PCG algorithm contains the computational and communication […]
Nov, 29

Runtime Comparison of CPU and GPU Using Portable Programming Models

Since increasing clock speeds is no longer enough to speed up computation, several alternative options exist. One of them is parallelism. For some problems it is possible to use the graphics processor as a massively parallel system and gain high speedups. Since NVIDIA introduced the unified device architecture and AMD switched to the OpenCL programming […]
Nov, 29

Parallel kNN on GPU Architecture Using OpenCL

In data mining applications, one of the useful algorithms for classification is the kNN algorithm. The kNN search has a wide usage in many research and industrial domains like 3-dimensional object rendering, content-based image retrieval, statistics, biology (gene classification), etc. In spite of some improvements in the last decades, the computation time required by the […]
Nov, 25

4th International Conference on Software and Computer Applications, ICSCA 2015

Submission Deadline: 2015-04-10. Topics: Software Engineering; AI and Knowledge based software engineering; Artificial Intelligence; Aspect-orientation and feature interaction; Business Process Reengineering & Science; Communication Systems and Networks; Component-Based Software Engineering; Computer & Software Engineering; Computer Animation and Design Contents; Computer Game Development, User Modeling and Management; Computer supported cooperative work; Cost Modeling and Analysis; Data […]
Nov, 25

Improving GPU Performance by Regrouping CPU-Memory Data

For fast, effective analysis of large complex systems, high-performance computing is essential. The NVIDIA Compute Unified Device Architecture (CUDA)-assisted central processing unit (CPU) / graphics processing unit (GPU) computing platform has proven its potential for high-performance computing. In CPU/GPU computing, original data and instructions are copied from CPU main memory to […]
Nov, 25

A Self-Optimizing Framework for Developing Metrology Software on Massive Parallel Processor Architectures

Standard PC hardware is rapidly increasing in parallel computing power in the form of multicore CPUs and general-purpose GPUs. To take advantage of this situation it is necessary to create specialized code. This is a very time-consuming and therefore expensive task. One approach to solving this problem is the OpenCL (Open Computing Language) standard. […]
Nov, 25

Anisotropic interfacial tension, contact angles, and line tensions: A graphics-processing-unit-based Monte Carlo study of the Ising model

As a generic example for crystals where the crystal-fluid interface tension depends on the orientation of the interface relative to the crystal lattice axes, the nearest neighbor Ising model on the simple cubic lattice is studied over a wide temperature range, both above and below the roughening transition temperature. Using a thin film geometry $L_x […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org