Posts

Jul, 31

MCMini: Monte Carlo on GPGPU

MCMini is a proof of concept that demonstrates the feasibility of Monte Carlo neutron transport using OpenCL with a focus on performance. This implementation, written in C, shows that tracing particles and calculating reactions on a 3D mesh can be done in a highly scalable fashion. These results demonstrate a potential path forward for MCNP […]
Jul, 29

High-Performance Online Spatial and Temporal Aggregations on Multi-core CPUs and Many-Core GPUs

Motivated by the practical needs for efficiently processing large-scale taxi trip data, we have developed techniques for high performance online spatial, temporal and spatiotemporal aggregations. These techniques include timestamp compression to reduce memory footprint, simple linear data structures for efficient in-memory scans and utilization of massively data parallel GPU accelerations for spatial joins. Our experiments […]
Jul, 29

Sigma*: Symbolic Learning of Stream Filters

We present Sigma*, a novel technique for learning symbolic models of software behavior. Sigma* addresses the challenge of synthesizing models of software by using symbolic conjectures and abstraction. By combining dynamic symbolic execution to discover symbolic input-output steps of the programs and counterexample guided abstraction refinement to over-approximate program behavior, Sigma* transforms arbitrary source representation […]
Jul, 29

A Novel GPU Implementation of Eigen Analysis for Risk Management

Portfolio risk is commonly defined as the standard deviation of its return. The empirical correlation matrix of asset returns in a portfolio has an intrinsic noise component. This noise is filtered for more robust performance. Eigendecomposition is a widely used method for noise filtering. The Jacobi algorithm has been a popular eigensolver technique due to its […]
Jul, 29

Implementation and Evaluation of Recurrence Equation Solvers on GPGPU systems using Rearrangement of Array Configurations

The recurrence equation solver is used in many numerical applications and other general-purpose applications, but it is inherently a sequential algorithm, so it is difficult to implement a parallel program for it. Recently, GPGPU (General-Purpose computing on Graphics Processing Units) has attracted a great deal of attention; it is used for general-purpose computations like numerical […]
Jul, 29

Distributed-Shared CUDA: Virtualization of Large-Scale GPU Systems for Programmability and Reliability

One of the difficulties for current GPGPU (General-Purpose computing on Graphics Processing Units) users is writing code to use multiple GPUs. One limiting factor is that only a few GPUs can be attached to a PC, which means that MPI (Message Passing Interface) would be a common tool to use tens or more GPUs. However, […]
Jul, 28

Fast Linear Algebra on GPU

GPUs have been successfully used for acceleration of many mathematical functions and libraries. A common limitation of those libraries is the minimal size of primitives being handled, in order to achieve a significant speedup compared to their CPU versions. The minimal size requirement can prove prohibitive for many applications. It can be loosened by batching […]
Jul, 28

Ensemble K-Means on Modern Many Core Hardware

Clustering involves partitioning a set of objects into subsets called clusters so that objects in the same cluster are similar according to some metric. Clustering is widely used in many fields like machine learning, data mining, pattern recognition and bioinformatics. The K-means algorithm is the most popular clustering algorithm, which uses distance as the […]
Jul, 28

Acceleration of Variance of Color Differences-Based Demosaicing Using CUDA

Image demosaicing algorithms are used to reconstruct a full color image from the incomplete color samples output (RAW data) of an image sensor overlaid with a Color Filter Array (CFA). Better demosaicing algorithms are superior in terms of acuity, dynamic range, signal to noise ratio, and artifact suppression, which make them suitable for high quality […]
Jul, 28

Speeding up LIP-Canny with CUDA programming

The LIP-Canny algorithm outperforms the traditional Canny algorithm at edge detection under varying illumination. This method is based on a robust mathematical model (LIP paradigm), which is closer to the human vision system. However, this model requires more computations and more complex operations than the traditional paradigm. Non-parallel implementations of LIP-Canny do not […]
Jul, 28

Computational modeling of synthetic microbial biofilms

Microbial biofilms are complex, self-organized communities of bacteria, which employ physiological cooperation and spatial organization to increase both their metabolic efficiency and their resistance to changes in their local environment. These properties make biofilms an attractive target for engineering, particularly for the production of chemicals such as pharmaceutical ingredients or biofuels, with the potential to […]
Jul, 27

A virtual memory based runtime to support multi-tenancy in clusters with GPUs

Graphics Processing Units (GPUs) are increasingly becoming part of HPC clusters. Nevertheless, cloud computing services and resource management frameworks targeting heterogeneous clusters including GPUs are still in their infancy. Further, GPU software stacks (e.g., CUDA driver and runtime) currently provide very limited support for concurrency. In this paper, we propose a runtime system that provides […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
