Posts
May, 7
Simulation of earthquake sloshing loads in a nuclear reactor
Sloshing flow inside a Lead-cooled Fast Nuclear Reactor during an earthquake is modelled, focusing on the evaluation of the loads the fluid exerts on the structure. AQUAgpusph, a free-software, OpenCL-accelerated SPH code, has been used. This tool is analysed, including a performance comparison with some available GPU-accelerated SPH codes, […]
May, 7
Learning Sparse Recurrent Neural Networks in Language Modeling
In the context of statistical language modeling, we explored the task of learning an Elman network with sparse weight matrices, as a pilot study towards learning a sparsely connected fully recurrent neural network, which would be potentially useful in many cases. We also explored how efficient and scalable it can be in practice. In particular, […]
May, 7
Evolution of a double-front Rayleigh-Taylor system using a GPU-based high resolution thermal Lattice-Boltzmann model
We study the turbulent evolution originating from a system subjected to a Rayleigh-Taylor instability with a double density at high resolution in a two-dimensional geometry, using a highly optimized thermal Lattice Boltzmann code for GPUs. The novelty of our investigation stems from the initial condition, given by the superposition of three layers with three […]
May, 7
Scaling Performance of FFT Computation on an Industrial Integrated GPU Co-processor: Experiments with Algorithm Adaptation
Recent Intel processors (Ivy Bridge, Haswell) contain an embedded on-chip GPU in addition to the main CPU. In this work we consider the issue of efficiently mapping Fast Fourier Transform computation onto such coprocessor units. To achieve this we pursue three goals: First, we want to study half-systematic ways to adjust the actual variant […]
May, 7
Improving the Programmability of GPU Architectures
Throughout the past decades, the tremendous growth of single-core performance has been the key enabler for digital technology to become ubiquitous in our society. Recently, diminishing returns on Dennard scaling resulted in power dissipation issues, leading to reduced performance growth. Performance growth has since been re-enabled by multi-core processors as well as by exploiting the energy […]
May, 7
Orchestrating Thread Scheduling and Cache Management to Improve Memory System Throughput in Throughput Processors
Throughput processors such as GPUs continue to provide higher peak arithmetic capability. Designing a high throughput memory system to keep the computational units busy is very challenging. Future throughput processors must continue to exploit data locality and utilize the on-chip and off-chip resources in the memory system more effectively to further improve the memory system […]
May, 7
Bio-Inspired Optimization of Ultra-Wideband Patch Antennas Using Graphics Processing Unit Acceleration
Ultra-wideband (UWB) wireless systems have recently gained considerable attention as effective communications platforms with the properties of low power and high data rates. Applications of UWB such as wireless USB put size constraints on the antenna, however, which can be very difficult to meet using typical narrow band antenna designs. The aim of this thesis […]
May, 7
Parallel Solving Massive Linear Equations with CUDA
By consulting state-of-the-art methods for solving massive systems of linear equations and for parallel computing, the main computational issue has been extracted from the finite element method. The author tests some solving routines on the CPU and designs and implements them on the GPU using CUDA. The coalesced access result on GPU shows a ten […]
May, 7
Simultaneous Use of CPU and GPU to Real Time Inverted Index Updating in Microblogs
Nowadays, with the development of various data networks, large volumes of data are continually being produced and updated. Managing such large data is one of the fundamental challenges in data mining. One of the main topics in this context is how to search among these large volumes of data. Therefore, it is required to produce typical powerful, […]
May, 6
Accelerating Cryptosystems on Hardware Platforms
In the past decade, one of the major breakthroughs in computer science theory was the first construction of a fully homomorphic encryption (FHE) scheme, introduced by Gentry. Using an FHE scheme, one may perform an arbitrary number of computations directly on the encrypted data without revealing the secret key. Therefore, a practical FHE provides an invaluable […]
May, 6
GPU-Accelerated Joint 1D and 2D Barcode Localization on Smartphones
The built-in cameras and powerful processors have turned smartphones into ubiquitous barcode scanners. In smartphone-based barcode scanning, barcode localization is an important preprocessing step that quickly scans the entire camera image and passes barcode candidates to the actual decoder. This paper presents the implementation steps of a robust joint 1D and 2D barcode localization algorithm […]
May, 6
Implementing an efficient method of check-pointing on CPU-GPU
In this paper, we describe the design, implementation, verification and analysis of fine-grained architectural support for efficient check-pointing and restart on a CPU-GPU heterogeneous system. We use Multi2sim, a simulator capable of emulating a CPU-GPU system. The simulator can emulate a 32-bit x86 CPU that launches OpenCL kernels on the GPU […]