Posts
Oct, 3
A fast GPU-based Monte Carlo simulation of proton transport with detailed modeling of non-elastic interactions
Purpose: Very fast Monte Carlo (MC) simulations of proton transport have been implemented recently on GPUs. However, these usually use simplified models for non-elastic (NE) proton-nucleus interactions. Our primary goal is to build a GPU-based proton transport MC with detailed modeling of elastic and NE collisions. Methods: Using CUDA, we implemented GPU kernels for these […]
Oct, 3
A stencil-based implementation of Parareal in the C++ domain specific embedded language STELLA
In view of the rapid rise of the number of cores in modern supercomputers, time-parallel methods that introduce concurrency along the temporal axis are becoming increasingly popular. For the solution of time-dependent partial differential equations, these methods can add another direction for concurrency on top of spatial parallelization. The paper presents an implementation of the […]
Oct, 2
Fast Estimation of Gaussian Mixture Model Parameters on GPU using CUDA
Gaussian Mixture Models (GMMs) are widely used among scientists e.g. in statistics toolkits and data mining procedures. In order to estimate parameters of a GMM the Maximum Likelihood (ML) training is often utilized, more precisely the Expectation-Maximization (EM) algorithm. Nowadays, a lot of tasks works with huge datasets, what makes the estimation process time consuming […]
Sep, 30
Parallel QuadTree Encoding of Large-Scale Raster Geospatial Data on Multicore CPUs and GPGPUs
Global remote sensing and large-scale environment modeling have generated vast amounts of raster geospatial images. To gain a better understanding of this data, researchers are interested in performing spatial queries over them, and the computation of those queries’ results is greatly facilitated by the existence of spatial indices. Additionally, though there have been major advances […]
Sep, 30
GPU-Accelerated Bayesian Learning and Forecasting in Simultaneous Graphical Dynamic Linear Models
We discuss modeling and GPU-based computation in a new class of multivariate dynamic models customized to learning and prediction with increasingly high-dimensional time series. This defines an approach to decoupling analysis into a parallel set of univariate time series dynamic models, while flexibly modeling cross-series relationships in a novel, induced class of time-varying graphical models […]
Sep, 30
Harnessing GPU Computing in System-Level Software
As the base of the software stack, system-level software is expected to provide efficient and scalable storage, communication, security and resource management functionalities. However, there are many computationally expensive functionalities at the system level, such as encryption, packet inspection, and error correction. All of these require substantial computing power. What’s more, today’s application workloads have […]
Sep, 30
Implementation of a Multi-User Detector for Satellite Return Links on a GPU Platform
Due to the scarcity and high cost of satellite frequency spectrum, it is very important to utilize the available spectrum as efficiently as possible. The efficient usage of the spectrum in the satellite return link is a challenging task, especially if multiple users are present. In previous works Multi-User Detection (MUD) techniques have been widely […]
Sep, 30
A (Somewhat Dated) Comparative Study of Betweenness Centrality Algorithms on GPU
The problem of computing the Betweenness Centrality (BC) is important in analyzing graphs in many practical applications like social networks, biological networks, transportation networks, electrical circuits, etc. Since this problem is computation intensive, researchers have been developing algorithms using high performance computing resources like supercomputers, clusters, and Graphics Processing Units (GPUs). Current GPU algorithms for […]
Sep, 29
MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
MEGAHIT is a NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It finished assembling a soil metagenomics dataset with 252Gbps in 44.1 hours and 99.6 hours on a single computing node with and without a GPU, respectively. MEGAHIT assembles the data as a whole, i.e., it […]
Sep, 29
A GPU-based Algorithm-specific Optimization for High-performance Background Subtraction
Background subtraction is an essential first stage in many vision applications differentiating foreground pixels from the background scene, with Mixture of Gaussians (MoG) being a widely used implementation choice. MoG’s high computation demand renders a real-time single threaded realization infeasible. With it’s pixel level parallelism, deploying MoG on top of parallel architectures such as a […]
Sep, 29
Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects
Accelerators (such as NVIDIA GPUs) and coprocessors (such as Intel MIC/Xeon Phi) are fueling the growth of next-generation ultra-scale systems that have high compute density and high performance per watt. However, these many-core architectures cause systems to be heterogeneous by introducing multiple levels of parallelism and varying computation/communication costs at each level. Application developers also […]
Sep, 29
Decoupling algorithms from the organization of computation for high performance image processing
Future graphics and imaging applications-from self-driving cards, to 4D light field cameras, to pervasive sensing-demand orders of magnitude more computation than we currently have. This thesis argues that the efficiency and performance of an application are determined not only by the algorithm and the hardware architecture on which it runs, but critically also by the […]