
Posts

Jan, 16

Theano: Deep Learning on GPUs with Python

In this paper, we present Theano, a framework in the Python programming language for defining, optimizing and evaluating expressions involving high-level operations on tensors. Theano offers most of NumPy’s functionality, but adds automatic symbolic differentiation, GPU support, and faster expression evaluation. Theano is a general mathematical tool, but it was developed with the goal of […]
Jan, 16

A Fast Jet Finder Algorithm Using Graphic Processing Unit

A collimated emission of hadrons, usually called a jet, is the experimental counterpart of the partons (quarks and gluons), which are not observed separately. The CMS detector at the LHC is ideally designed to study jet tomography, which is an important probe of the hot and dense medium formed during heavy-ion collisions. Although CMS […]
Jan, 16

Data Triage and Visual Analytics for Scientific Visualization

As the speed of computers continues to increase at a very fast rate, the size of data generated from scientific simulations has now reached petabytes ($10^{15}$ bytes) and beyond. Under such circumstances, no existing techniques can be used to perform effective data analysis at full precision. To analyze large-scale data sets, visual analytics […]
Jan, 16

Feature Extraction and Visualization from Higher-Order CFD Data

Computational fluid dynamics (CFD) methods have been employed in the study of subjects such as aeroacoustics, gas dynamics, turbomachinery, and viscoelastic fluids, among others. However, the need for accuracy and high performance has resulted in methods whose solutions are becoming increasingly complex. In this context, feature extraction and visualization methods play a key role, making […]
Jan, 16

Parallel AES Encryption Engines for Many-Core Processor Arrays

By exploring different granularities of data-level and task-level parallelism, we map 16 implementations of an Advanced Encryption Standard (AES) encipher with both online and offline key expansion on a fine-grained many-core system. The smallest design utilizes only 6 cores for offline key expansion and 8 cores for online key expansion, while the largest requires 107 […]
Jan, 16

Nested Intervals Tree Encoding with System of Residual Classes

This paper describes one way to represent tree-like structures in databases. The authors propose extending V. Tropashko's nested intervals model: the numbers in the intervals are represented in a system of residual classes. This avoids storing big numbers and is easy to implement with parallel-programming algorithms.
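To illustrate why a system of residual classes avoids storing big numbers, here is a minimal Python sketch of a generic residue number system. It is not the authors' encoding: the moduli and the function names (`to_residues`, `add`, `mul`, `from_residues`) are assumptions chosen for illustration. A large integer is stored as a tuple of small remainders, arithmetic is done independently on each component (which is what makes it easy to parallelize), and the original value is recovered with the Chinese Remainder Theorem.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; values below prod(MODULI) = 17017
# are represented uniquely.
MODULI = (7, 11, 13, 17)


def to_residues(n, moduli=MODULI):
    """Represent n by its remainders modulo each modulus."""
    return tuple(n % m for m in moduli)


def add(a, b, moduli=MODULI):
    """Component-wise addition; each component is independent of the others."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def mul(a, b, moduli=MODULI):
    """Component-wise multiplication; each component is independent."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_residues(residues, moduli=MODULI):
    """Recover the integer via the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        M_i = M // m
        # pow(M_i, -1, m) is the modular inverse of M_i modulo m (Python 3.8+).
        total += r * M_i * pow(M_i, -1, m)
    return total % M
```

Results are only meaningful while the true value stays below the product of the moduli, so a real implementation would pick moduli large enough for the deepest tree it needs to encode.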
Jan, 15

Spectral Method Characterization on FPGA and GPU Accelerators

As CPU clock frequencies plateau and the doubling of CPU cores per processor exacerbates the memory wall, hybrid core computing, utilizing CPUs augmented with FPGAs and/or GPUs, holds the promise of addressing high-performance computing demands, particularly with respect to performance, power and productivity. This paper compares the sustained performance of a complex, single precision, floating-point, […]
Jan, 15

CPU-GPU hybrid accelerating the Zuker algorithm for RNA secondary structure prediction application

BACKGROUND: Prediction of ribonucleic acid (RNA) secondary structure remains one of the most important research areas in bioinformatics. The Zuker algorithm is one of the most popular methods of free energy minimization for RNA secondary structure prediction. Thus far, few studies have been reported on the acceleration of the Zuker algorithm on general-purpose processors or […]
Jan, 15

Enhancing Performance for Solving Finite Element Mesh using Heterogeneous Platforms

Finite element methods (FEM) are the most widely used methods for simulating structural dynamics problems. Due to their highly compute-intensive nature, these methods are used with domain decomposition, where the problem is divided into subdomains which are individually solved and coupled together to obtain the final solution. One of the latest and most efficient approach […]
Jan, 15

Declarative Parallel Programming for GPUs

The recent rise in the popularity of Graphics Processing Units (GPUs) has been fueled by software frameworks, such as NVIDIA’s Compute Unified Device Architecture (CUDA) and Khronos Group’s OpenCL, that make GPUs available for general-purpose computing. However, CUDA and OpenCL are still low-level approaches that require users to handle details about data layout and […]
Jan, 15

A visibility-based approach for occupancy grid computation in disparity space

Occupancy grids are a very convenient tool for environment representation in robotics. This paper details a novel approach to computing occupancy grids from stereo vision, and shows its application in the field of intelligent vehicles. In the proposed approach, occupancy is initially computed directly in the stereoscopic sensor’s disparity space. The calculation formally accounts for […]
Jan, 14

Linear Algebra Algorithms for Hybrid Architectures with XKaapi

The emergence and continuing use of multicore architectures with GPU accelerators require changes in current software to address the gap between the accelerators’ compute speed and the CPU-GPU communication speed. We describe how to develop linear algebra algorithms for these new and emerging hybrid architectures using XKaapi.

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
