Posts
Oct, 29
Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging
We describe the neural-network training framework used in the Kaldi speech recognition toolkit, which is geared towards training DNNs with large amounts of training data using multiple GPU-equipped or multi-core machines. In order to be as hardware-agnostic as possible, we needed a way to use multiple machines without generating excessive network traffic. Our method is […]
Oct, 27
Testing and Exposing Weak Graphics Processing Unit Memory Models
Graphics Processing Units (GPUs) are highly parallel shared memory microprocessors, and as such, they are prone to the same concurrency considerations as their traditional multicore CPU counterparts. In this thesis, we consider shared memory consistency, i.e. what values can be read when issued concurrently with writes on current GPU hardware. While memory consistency has been […]
Oct, 27
Parallel Finite Volume Algorithm on Graphic Processing Units (GPU)
The capabilities of Graphic Processing Units (GPU) as a computational tool in CFD have been investigated here. Several solvers for linear matrix equations have been benchmarked on the GPU, and it is shown that Gauss-Seidel gives the best performance for the GPU architecture. Compared to CPU on a case of lid-driven cavity flow, speedups of up […]
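For readers unfamiliar with the Gauss-Seidel method named in this abstract, the following is a minimal serial sketch in plain Python. It is not the GPU solver from the paper; the function name `gauss_seidel`, the dense list-of-lists matrix format, and the stopping tolerance are assumptions made only for illustration.

```python
def gauss_seidel(A: list[list[float]], b: list[float],
                 max_iters: int = 200, tol: float = 1e-10) -> list[float]:
    """Solve A x = b iteratively (hypothetical helper, serial reference only).

    Convergence is assumed, e.g. for a strictly diagonally dominant A.
    """
    n = len(b)
    x = [0.0] * n  # initial guess
    for _ in range(max_iters):
        max_delta = 0.0
        for i in range(n):
            # Use the newest available values of x (the Gauss-Seidel update).
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - s) / A[i][i]
            max_delta = max(max_delta, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_delta < tol:  # stop once the largest update is negligible
            break
    return x


if __name__ == "__main__":
    # Small diagonally dominant system: 4x + y = 1, 2x + 3y = 2.
    print(gauss_seidel([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0]))
```

Each unknown is updated in place, so later rows in the same sweep already see the new values; this is what distinguishes the scheme from Jacobi iteration.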
Oct, 27
Bayesian Neural Networks in Data-Intensive High Energy Physics Applications
This dissertation studies a graphical processing unit (GPU) construction of Bayesian neural networks (BNNs) using large training data sets. The goal is to create a program for the mapping of phenomenological Minimal Supersymmetric Standard Model (pMSSM) parameters to their predictions. This would allow for a more robust method of studying the Minimal Supersymmetric Standard Model, […]
Oct, 27
Finding Longest Common Subsequences by GPU-Based Parallel Ant Colony Optimization
The longest common subsequence (LCS) problem is one of the classic problems in string processing. It is commonly used in file comparison, pattern recognition, and computational biology as a measure of sequence similarity. Given a set of strings, the LCS is the longest string that is a subsequence of every string in the set. For […]
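To make the definition above concrete, here is a minimal sketch of the classic dynamic-programming solution for the two-string case. It is not the ant colony method the paper proposes, and the function name `lcs` is an assumption chosen for illustration.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (two-string case only)."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back through the table to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    print(len(lcs("ABCBDAB", "BDCABA")))
```

The table costs time and memory proportional to the product of the two lengths. Extending this table to many strings grows exponentially with the number of strings, which is one reason heuristic approaches are used for the set version of the problem.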
Oct, 27
Contract-Based General-Purpose GPU Programming
Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. As an obstacle to widespread adoption, programming GPUs has remained difficult due to the need for low-level control of the hardware to achieve good performance. This paper suggests a programming […]
Oct, 25
GPGPU Acceleration for Skeletal Animation-comparing OpenCL with CUDA and GLSL
Existing matrix palette algorithms for skeletal animation are accelerated with GPGPU techniques based on GLSL or CUDA. Because GLSL is an extension of the OpenGL graphics library, it couples rendering and computation closely and is inconvenient to reuse, while CUDA is designed only for NVIDIA GPUs. In this paper GPGPU based […]
Oct, 25
Evacuation Route Modeling and Planning with General Purpose GPU Computing
This work introduces a bilevel, stochastic optimization problem aimed at robust, regional evacuation network design and shelter location under uncertain hazards. A regional planner, acting as a Stackelberg leader, chooses among evacuation-route contraflow operation and shelter location to minimize the expected risk exposure to evacuees. Evacuees then seek an equilibrium with respect to risk exposure […]
Oct, 25
On the Efficiency of CPU and Hybrid CPU-GPU Systems in Computational Biology Tasks
The complexity and diversity of computational biology tasks require a deliberate approach to computational resource management. We have analyzed the performance of the common CPU and hybrid CPU-GPU hardware configurations in molecular dynamics and homology modeling tasks. Our results show that on dual-processor nodes it is overall more efficient to execute two […]
Oct, 25
Medical imaging using CUDA
As multiple sclerosis is known to cause atrophy and deformation in the brain, it also influences the shape and size of the corpus callosum. Longitudinal studies try to quantify these changes using medical image analysis techniques for measuring and analyzing the shape and size of a corpus callosum cross-section embedded in a specially selected measurement […]
Oct, 25
CUVLE: Variable-Length Encoding on CUDA
Data compression is the process of representing information in a compact form, in order to reduce the storage requirements and, hence, communication bandwidth. It has been one of the critical enabling technologies for the ongoing digital multimedia revolution for decades. In the variable-length encoding (VLE) compression method, most frequently occurring symbols are replaced by codes […]
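As an illustration of variable-length encoding in general, the sketch below builds a Huffman code, one well-known VLE scheme in which frequent symbols receive short codewords. Whether CUVLE uses Huffman codes is not stated in this excerpt, so treat this as a serial reference under that assumption; the names `huffman_codes` and `encode` are hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(data: str) -> dict[str, str]:
    """Build a prefix-free code table mapping each symbol to a bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}

    # Heap entries are (weight, tiebreak, partial code table); the unique
    # integer tiebreak ensures the dicts are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(data: str, codes: dict[str, str]) -> str:
    """Concatenate the codeword of every symbol in data."""
    return "".join(codes[s] for s in data)


if __name__ == "__main__":
    table = huffman_codes("aaaabbc")
    print(table, len(encode("aaaabbc", table)))
```

Because codewords have different lengths, the output position of each symbol depends on all the symbols before it, which is what makes the encoding step awkward to parallelize.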
Oct, 25
5th International Conference on Computer Communication and Management, ICCCM 2015
Submission Deadline: 2015-03-01. Publication: Conference papers can be selected for publication in the International Journal of Computer and Communication Engineering (IJCCE) or the Journal of Advanced Management Science (JOAMS); excellent papers will be selected for publication in the International Journal of e-Education, e-Business, e-Management and e-Learning (IJEEEE). Topic: A. Computing • Parallel and Distributed Computing • High-Performance Computing • […]