Posts

Nov, 9

Relax-Miracle: GPU Parallelization of Semi-Analytic Fourier-Domain solvers for Earthquake Modeling

Effective utilization of GPU processing capacity for scientific workloads is often limited by memory throughput and PCIe communication transfer times. This is particularly true for semi-analytic Fourier-domain computations in earthquake modeling (Relax) where operations on large-scale 3D data structures can require moving large volumes of data from storage to the compute in predictable but orthogonal […]
Nov, 9

Parallel FIM Approach on GPU using OpenCL

In this paper, we describe the GPU-Eclat algorithm, a GPGPU (General-Purpose Graphics Processing Unit) enhanced implementation of Frequent Itemset Mining (FIM). Frequent itemsets are extracted from a transactional database, an essential task in data mining because of its broad applications in mining association rules, time series, correlations, etc. The […]
Nov, 9

Dogwild! – Distributed Hogwild for CPU & GPU

Deep learning has enjoyed tremendous success in recent years. Unfortunately, training large models can be very time-consuming, even on GPU hardware. We describe a set of extensions to the state-of-the-art Caffe library [3], allowing training on multiple threads and GPUs, and across multiple machines. Our focus is on architecture, implementing asynchronous […]
Nov, 5

Graphics Processing Unit-Based Computer-Aided Design Algorithms for Electronic Design Automation

This dissertation presents research focusing on reshaping the design paradigm of electronic design automation (EDA) applications to embrace the computational throughput of a massively parallel computing architecture. The EDA industry has gone through major evolution in algorithm designs over the past several decades, delivering improved and more sophisticated design tools. Today, these tools provide a […]
Nov, 5

GPU Acceleration of k-Nearest Neighbor Search in Face Classifier based on Eigenfaces

Face recognition is a specialized case of object recognition, and has broad applications in security, surveillance, identity management, law enforcement, human-computer interaction, and automatic photo and video indexing. Because human faces occupy a narrow portion of the total image space, specialized methods are required to identify faces based on subtle differences. One such method is […]
Nov, 5

Parallelization techniques of the x264 video encoder

Users of every kind of video streaming service, including web applications and high-definition terrestrial broadcast services, demand higher video quality. All of those video streams are first encoded using a compression format, one of which is H.264/MPEG-4 AVC. The main issue is that the better the quality of the video the […]
Nov, 5

Highly optimized simulations on single- and multi-GPU systems of 3D Ising spin glass

We present a highly optimized implementation, running on CUDA-enabled GPUs, of a Monte Carlo (MC) simulator for the three-dimensional Ising spin-glass model with bimodal disorder, i.e., the 3D Edwards-Anderson model. Multi-GPU systems exchange data by means of the Message Passing Interface (MPI). The chosen MC dynamics is the classic Metropolis one, which is purely […]
Nov, 5

A GPU-Based Wide-Band Radio Spectrometer

The Graphics Processing Unit (GPU) has become an integral part of astronomical instrumentation, enabling high-performance online data reduction and accelerated online signal processing. In this paper, we describe a wide-band reconfigurable spectrometer built using an off-the-shelf GPU card. This spectrometer, when configured as a polyphase filter bank (PFB), supports a dual-polarization bandwidth of up to […]
Nov, 3

Profiling of Data-Parallel Processors

Profiling data can help to improve an application with respect to various objectives, such as execution time, energy consumption, or even thermal sensor placement for an upcoming device. This survey reviews state-of-the-art profiling tools for data-parallel processors, such as Nsight, PAPI, and TAU, as well as Lynx. Additionally, the attained knowledge is utilized to detect the bottleneck […]
Nov, 3

A Fast Poisson Solver with Periodic Boundary Conditions for GPU Clusters in Various Configurations

Fast Poisson solvers using the Fast Fourier Transform on uniform grids are especially suited for parallel implementation, making them good candidates for porting to graphics processing unit (GPU) devices. The goal of the following work was to implement, test, and evaluate a fast Poisson solver for periodic boundary conditions for use on a variety of GPU […]
Nov, 3

Bounds on the Energy Consumption of Computational Kernels

As computing devices evolve with successive technology generations, many machines target either the mobile or high-performance computing/datacenter environments. In both of these form factors, energy consumption often represents the limiting factor on hardware and software efficiency. On mobile devices, limitations in battery technology may reduce possible hardware capability due to a tight energy budget. On […]
Nov, 3

A GPU-based Framework for Real-time Free Viewpoint Television

This thesis addresses two main problems of Free Viewpoint TV: generating arbitrary viewpoints in real time and delivering them to the end user. For the first problem, a GPU-based algorithm capable of generating free viewpoints from a network of fixed HD video cameras was developed. We used a space-sweep algorithm to estimate depth information. The view generation sub-system […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
