14286

Posts

Jul, 20

Performance Analysis of GPU-Accelerated Filter-Based Source Finding for HI Spectral Line Image Data

Searching for sources of electromagnetic emission in spectral-line radio astronomy interferometric data is a computationally intensive process. Parallel programming techniques and High Performance Computing hardware may be used to improve the computational performance of a source finding program. However, it is desirable to further reduce the processing time of source finding in order to decrease […]
Jul, 20

Accelerating a Movie Recommender System Using VirtualCL on a Heterogeneous GPU Cluster

Present day market offers a large number of movies which overwhelm people with choices. In order to quickly navigate through all the possible movies and find the interesting ones, the user can take advantage of recommender systems for movies. This thesis studies a movie recommender system which uses image processing and computer vision algorithms. The […]
Jul, 20

Parallel Programming in Actor-Based Applications via OpenCL

GPU and multicore hardware architectures are commonly used in many different application areas to accelerate problem solutions relative to single CPU architectures. The typical approach to accessing these hardware architectures requires embedding logic into the programming language used to construct the application; the two primary forms of embedding are: calls to API routines to access […]
Jul, 20

Improving performance portability for GPU-specific OpenCL kernels on multi-core/many-core CPUs by analysis-based transformations

OpenCL is an open heterogeneous programming framework. Although OpenCL programs are functionally portable, it does not provide performance portability, so code transformation often plays an irreplaceable role. When adapting GPU-specific OpenCL kernels to run on multi-core/many-core CPUs, coarsening the thread granularity is necessary and thus extensively used. However, locality concerns exposed in GPU-specific OpenCL code […]
Jul, 20

GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced […]
Jul, 17

DeepFont: Identify Your Font from An Image

As font is one of the core design concepts, automatic font identification and similar font suggestion from an image or photo has been on the wish list of many designers. We study the Visual Font Recognition (VFR) problem, and advance the state-of-the-art remarkably by developing the DeepFont system. First of all, we build up the […]
Jul, 17

Overhauling SC atomics in C11 and OpenCL

Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A […]
Jul, 17

GPU-based visualization of domain-coloured algebraic Riemann surfaces

We examine an algorithm for the visualization of domain-coloured Riemann surfaces of plane algebraic curves. The approach faithfully reproduces the topology of the surface and also preserves some of its geometry. We discuss how the algorithm can be implemented efficiently in OpenGL with geometry shaders, and (less efficiently) even in WebGL with multiple render targets […]
Jul, 17

Scaling Monte Carlo Tree Search on Intel Xeon Phi

Many algorithms have been parallelized successfully on the Intel Xeon Phi coprocessor, especially those with regular, balanced, and predictable data access patterns and instruction flows. Irregular and unbalanced algorithms are harder to parallelize efficiently. They are, for instance, present in artificial intelligence search algorithms such as Monte Carlo Tree Search (MCTS). In this paper we […]
Jul, 17

Many-Core Architectures: Hardware-Software Optimization and Modeling Techniques

During the last few decades an unprecedented technological growth has been at the center of the embedded systems design paramount, with Moore’s Law being the leading factor of this trend. Today in fact an ever increasing number of cores can be integrated on the same die, marking the transition from state-of-the-art multi-core chips to the […]
Jul, 15

Performance analysis of GPGPU and CPU On AES Encryption

The advancements in computing have led to tremendous increase in the amount of data being generated every minute, which needs to be stored or transferred maintaining high level of security. The military and armed forces today heavily rely on computers to store huge amount of important and secret data, that holds a big deal for […]
Jul, 15

Automatic Optimization of Thread Mapping for a GPGPU Programming Framework

Although General Purpose computation on Graphics Processing Units (GPGPU) is widely used for the high-performance computing, standard programming frameworks such as CUDA and OpenCL are still difficult to use.They require low-level specifications and the hand-optimization is a large burden. Therefore we are developing an easier framework named MESI-CUDA. Based on a virtual shared memory model, […]

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us: