Posts

Jul, 29

Performance Analysis of a Particle-in-Cell Plasma Physics Code on Homogeneous and Heterogeneous HPC Systems

Particle-in-cell (PIC) methods are among the most widely used methods in plasma simulations. We present a comprehensive evaluation of the PIC code performance on four current parallel platforms: IBM PowerPC, Intel Nehalem (SMP), Intel Sandy Bridge (SMP) and ARM GPU. The behavior of the computational algorithms and data structures is analyzed to deduce which code optimizations will […]
Jul, 29

Sound Synthesis Using Physical Modeling on Heterogeneous Computing Platforms

The paper presents a comparison of central processing unit (CPU) and graphics processing unit (GPU) performance in sound synthesis based on physical modeling. The goal was to achieve real-time performance with two- and three-dimensional finite difference (FD) instrument models. Two abstract instruments, a membrane and a block, were modeled and tested using a CPU and […]
Jul, 29

Prospects of GPGPU in the Auger Offline Software Framework

The Pierre Auger Observatory is currently the largest experiment dedicated to unveiling the nature and origin of the highest-energy cosmic rays. The software framework ‘Offline’ has been developed by the Pierre Auger Collaboration for joint analysis of data from different independent detector systems used in one observatory. While reconstruction modules are specific to the […]
Jul, 29

Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature

We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation performance of the front-end network using phone-class information. At the front-end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, […]
Jul, 29

OKL: A Unified Language for Parallel Architectures

Rapid evolution of computer processor architectures has spawned multiple programming languages and standards. This thesis strives to address the challenges caused by fast and cyclical changes in programming models. The novel contribution of this thesis is the introduction of an abstract unified framework which addresses portability and performance for programming manycore devices. To test this […]
Jul, 28

Optimization of a finite element code implemented in MATLAB: On the use of GPUs for High Performance Computing

The Department of Mechanical and Materials Engineering has developed a 2D Finite Element code based on geometry-independent Cartesian grids (cgFEM) capable of solving shape optimization problems as well as making patient-specific analyses using medical images. A similar code in 3D (FEAVox) is currently under development. Both codes are implemented in MATLAB, a simple and […]
Jul, 28

Experiences in Speeding Up Computer Vision Applications on Mobile Computing Platforms

Computer vision (CV) is widely expected to be the next big thing in mobile computing. The availability of a camera and a large number of sensors in mobile devices will enable CV applications that understand the environment and enhance people’s lives through augmented reality. One of the problems yet to be solved is how to transfer […]
Jul, 27

Processing Large-scale XML Files on GPGPU Cluster

XML has been used as a textual data format for transporting and storing information in many areas. However, the cost of processing large-scale XML files becomes a serious issue for general processing methods. In this paper, we propose a design and implementation of a large-scale XML processing system on a GPU cluster to address […]
Jul, 27

Real-time Ray tracing and Editing of Large Voxel Scenes

A novel approach is presented to render large voxel scenes in real-time. The approach differs from existing solutions in that a large emphasis is put on allowing the user to edit and stream large datasets. Previous solutions often use compression schemes involving hierarchical data layouts such as sparse voxel octrees that require some form of […]
Jul, 27

Fast-Coding Robust Motion Estimation Model in a GPU

Nowadays, vision systems are used for countless purposes. Motion estimation is a discipline that allows relevant information to be extracted, such as pattern segmentation, 3D structure or object tracking. However, the real-time requirements of most applications have limited its consolidation, considering the adoption of high-performance systems to meet response times. With the emergence of […]
Jul, 27

An efficient KNN algorithm implemented on FPGA based heterogeneous computing system using OpenCL

Accurate and efficient data classification techniques are of vital importance to many problems and have developed rapidly in recent decades. The K-Nearest Neighbor (KNN) algorithm, one of the most important of these, is widely used in text categorization, predictive analysis, data mining, image recognition and other areas. To accelerate the algorithm and to optimize the parallel implementation […]
Jul, 27

Irregular algorithms on the Xeon Phi

The Xeon Phi is a coprocessor first released by Intel in 2012. With x86 instruction set support, 60 cores and up to 2 teraflops of single-precision performance, the Xeon Phi seems promising and has gained wide interest. The world’s fastest supercomputer to date, the Tianhe-2, features the Xeon Phi, as does the recently announced 180 […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors