Posts

Dec, 23

Single Server Multi-GPU Training of ConvNets

In this work we evaluate different approaches to parallelize computation of convolutional neural networks across several GPUs within the same server.

Dec, 23

Fast Training of Convolutional Networks through FFTs

Convolutional networks are one of the most widely employed architectures in computer vision and machine learning. In order to leverage their ability to learn complex functions, large amounts of data are required for training. Training a large convolutional network to produce state-of-the-art results can take weeks, even when using modern GPUs. Producing labels using a […]

Dec, 22

Resource Centered Computing delivering high parallel performance

Modern parallel programming requires a combination of different paradigms, expertise and tuning, that correspond to the different levels in today’s hierarchical architectures. To cope with the inherent difficulty, ORWL (ordered read-write locks) presents a new paradigm and toolbox centered around local or remote resources, such as data, processors or accelerators. ORWL programmers describe their computation […]

Dec, 22

Energy Auto-tuning using the Polyhedral Approach

As the HPC community moves into the exascale computing era, application energy has become a big concern. Tuning for energy will be essential in the effort to overcome the limited power envelope. How is tuning for lower energy related to tuning for faster execution? Understanding that relationship can guide both performance and energy tuning for […]

Dec, 22

Speed-Up Improvement Using Parallel Approach in Image Steganography

This paper presents a parallel approach to improve the time complexity problem associated with sequential algorithms. An image steganography algorithm in the transform domain is considered for implementation. Image steganography is a technique to hide a secret message in an image. With the parallel implementation, a large message can be hidden in a large image since it does not […]

Dec, 22

Numerical Simulation for the MHD System in 2D Using OpenCL

In this work we compute the MHD equations with divergence cleaning on GPU. The method is based on the finite volume approach and Strang dimensional splitting. The simplicity of the approach makes it a good candidate for a GPU implementation with OpenCL. With adequate optimization of memory access, we achieve very high speedups, compared to a […]

Dec, 22

Accelerating Pairwise DNA Sequence Alignment using the CUDA Compatible GPU

We present a novel implementation of pairwise DNA sequence alignment, distinct from the dynamic programming solution of the Smith-Waterman algorithm. The proposed implementation uses CUDA, the parallel computing platform and programming model invented by NVIDIA. The main idea of the proposed implementation is to assign different nucleotide weights and then merge the sub-sequences of […]

Dec, 21

Architecting an LTE Base Station with Graphics Processing Units

Due to the rapid growth of mobile communication, wireless base stations are becoming a significant consumer of computational resources. Historically, base stations have been built from ASICs, DSP processors, or FPGAs. This paper studies the feasibility of building wireless base stations from commercial graphics processing units (GPUs). GPUs are attractive because they are widely used […]

Dec, 21

GPU Accelerated Graph SLAM and Occupancy Voxel Based ICP For Encoder-Free Mobile Robots

Learning a map of an unknown environment and localising a robot in it is a common problem in robotics, with solutions usually requiring an estimate of the robot’s motion. In scenarios such as Urban Search and Rescue, motion encoders can be highly inaccurate, and weight and battery requirements often limit computing power. We have developed […]

Dec, 21

Parallel Compact Genetic Algorithm on CUDA-C Platform

This paper deals with the parallel implementation of the compact Genetic Algorithm on the Compute Unified Device Architecture (CUDA) GPU platform. We elaborate on the implementation details on the parallel platform.

Dec, 21

Videogame Graphics, BigData & Analytics

The purpose of this coffee shop read is to highlight the criticality of videogames as a component of the "Convergence" of some amazing technologies (in particular: Cloud, Gaming/MMOG, Gamification and BigData) that is clear to many inside the IT world. I am not a deep technical "guru"; I am a businessman who seeks […]

Dec, 21

Design and Optimization of OpenFOAM-based CFD Applications for Modern Hybrid and Heterogeneous HPC Platforms

The progress of high performance computing platforms is dramatic, and most of the simulations carried out on these platforms result in improvements on one level, yet expose shortcomings in the capabilities of current CFD packages. Therefore, hardware-aware design and optimizations are crucial to exploiting modern computing resources. This thesis proposes optimizations aimed at accelerating numerical […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
