Posts
Apr, 6
Are Very Deep Neural Networks Feasible on Mobile Devices?
In recent years, the computing power of mobile devices has increased tremendously, a trend that is expected to continue. With high-quality onboard cameras, these devices are capable of collecting large volumes of visual information. Motivated by the observation that processing this video on the mobile device can enable many new applications, […]
Apr, 6
GPU-accelerated stochastic predictive control of drinking water networks
Despite the proven advantages of scenario-based stochastic model predictive control for the operational control of water networks, its applicability is limited by its considerable computational footprint. In this paper we fully exploit the structure of these problems and solve them using a proximal gradient algorithm parallelizing the involved operations. The proposed methodology is applied and […]
Apr, 3
Implementation of the SYCL Heterogeneous Computing Library
Heterogeneous computing is becoming more popular with the lack of CPU performance increases, the exceptional rate of GPU performance growth, and the emergence of other programmable computing elements. However, programming heterogeneous systems is still problematic due to differing hardware, explicit data copying, and synchronization. The SYCL specification aims to simplify heterogeneous programming by building on […]
Apr, 3
sPEGG: high throughput eco-evolutionary simulations on commodity graphics processors
Integrating population genetics into community ecology theory is a major goal in ecology and evolution, but analyzing the resulting models is computationally daunting. Here we describe sPEGG (simulating Phenotypic Evolution on General Purpose Graphics Processing Units (GPGPUs)), an open-source, multi-species forward-time population genetics simulator. Using a single commodity GPGPU instead of a single central processor, […]
Apr, 3
Classification-based Financial Markets Prediction using Deep Neural Networks
Deep neural networks (DNNs) are powerful types of artificial neural networks (ANNs) that use several hidden layers. They have recently gained considerable attention in the speech transcription and image recognition community (Krizhevsky et al., 2012) for their superior predictive properties, including robustness to overfitting. However, their application to algorithmic trading has not been previously researched, […]
Apr, 3
Dynamic Sparse-Matrix Allocation on GPUs
Sparse matrices are a core component in many numerical simulations, and their efficiency is essential to achieving high performance. Dynamic sparse-matrix allocation (insertion) can benefit a number of problems such as sparse-matrix factorization, sparse-matrix-matrix addition, static analysis (e.g. points-to analysis), computing transitive closure, and other graph algorithms. Existing sparse-matrix formats are poorly designed to handle […]
Apr, 3
Evaluating the Performance Impact of Multiple Streams on the MIC-based Heterogeneous Platform
Using multiple streams can improve overall system performance by mitigating the data transfer overhead on heterogeneous systems. Prior work focuses largely on GPUs, but little is known about the performance impact on the Intel Xeon Phi. In this work, we apply multiple streams to six real-world applications on Phi. We then systematically evaluate the […]
Mar, 29
A Novel CSR-Based Sparse Matrix-Vector Multiplication on GPUs
Sparse matrix-vector multiplication (SpMV) is an important operation in scientific computations. Compressed sparse row (CSR) is the most frequently used format to store sparse matrices. However, CSR-based SpMVs on graphics processing units (GPUs), e.g., CSR-scalar and CSR-vector, usually have poor performance due to irregular memory access patterns. This motivates us to propose a perfect CSR-based […]
Mar, 29
A generalized GPU-based connected component labeling algorithm
We propose a generalized GPU-based connected component labeling (CCL) algorithm that can be applied to both various lattices and to non-lattice environments in a uniform fashion. We extend our recent GPU-based CCL algorithm without the use of conventional iteration to the generalized method. As an application of this algorithm, we deal with the bond percolation […]
Mar, 29
Generic Inverted Index on the GPU
Data variety, one of the three Vs of Big Data, is manifested by a growing number of complex data types such as documents, sequences, trees, graphs and high-dimensional vectors. To perform similarity search on these data, existing works mainly choose to create customized indexes for different data types. Due to the diversity […]
Mar, 29
A Stencil DSEL for Single Code Accelerated Computing with SYCL
Stencil kernels arise in many scientific codes as the result of discretizing natural, continuous phenomena. Many research works have designed stencil frameworks to help programmers optimize stencil kernels for performance, and to target CPUs or accelerators. However, existing stencil kernels, either library-based or language-based, require writing distinct source codes for accelerated kernels and for […]
Mar, 29
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The […]