Posts
Feb, 5
Vortex Methods for Fluid Simulation in Computer Graphics
Fluid simulations for computer graphics applications have attracted the attention of many researchers and practitioners due to the enhanced realism that natural phenomena simulation adds to graphical applications. Vortex methods are receiving increasing attention from the computer graphics community for simple and direct modeling of complex flow phenomena such as turbulence. However, vortex methods have […]
Feb, 5
A Comparison of CPU and OpenCL Parallelization Methods for Correlation and Graph Layout Algorithms used in the Network Analysis of High Dimensional Data
MOTIVATION: Many algorithms used in the analysis of high dimensional data require significant processing time due to the sheer number of values compared. We describe the results of the parallelization of two algorithms central to the functionality of the network analysis tool BioLayout Express 3D: the calculation of correlation (Pearson, Spearman Rank) coefficient matrices used to […]
Feb, 5
Real-Time Phase Masks for Interactive Stimulation of Optogenetic Neurons
Experiments with networks of optogenetically altered neurons require stimulation with high spatio-temporal selectivity. Computer-assisted holography is an energy-efficient method for robust and reliable addressing of single neurons on the millisecond timescale inherent to biological information processing. We show that real-time control of neurons can be achieved by a CUDA-based hologram computation.
Feb, 3
Parallelization of the QR Decomposition with Column Pivoting Using Column Cyclic Distribution on Multicore and GPU Processors
The QR decomposition with column pivoting (QRP) of a matrix is widely used for numerical rank revealing in applications. The performance of the LAPACK implementation (DGEQP3) of the Householder QRP algorithm is limited by the Level 2 BLAS operations required for updating the column norms. In this paper, we propose an implementation of the QRP algorithm using […]
Feb, 3
Hybrid CPU-GPU Distributed Framework for Large Scale Mobile Networks Simulation
Most existing packet-level simulation tools are designed for experiments that model small- to medium-scale networks. The main reason for this limitation is the amount of computation power and memory available in a quasi mono-process simulation environment. To enable efficient packet-level simulation for large-scale scenarios, we introduce a new CPU-GPU co-simulation framework […]
Feb, 3
JPEG 2000 Wireless Image Transmission System using Encryption Domain Authentication
In this paper, we propose a wireless high resolution video transmission system with encryption and authentication. The proposed system is implemented using JPEG 2000 coding. We implement the JPEG 2000 coder either on a GPU using CUDA, which is an integrated development environment for GPUs, or with a JPEG 2000 codec LSI. Moreover, the authentication system can check the […]
Feb, 3
Fast and Maliciously Secure Two-Party Computation Using the GPU
We describe, and implement, a maliciously secure protocol for secure two-party computation, based on Yao’s garbled circuit and an efficient OT extension, in a parallel computational model. The implementation is done using CUDA and yields the fastest results for maliciously secure two-party computation in a realistic and practical setting by using a simple consumer grade […]
Feb, 2
Software Reliability Enhancements for GPU Applications
As the role of highly parallel accelerators becomes more important in high performance computing, so does the need to ensure their reliable operation. In applications where precision and correctness are a necessity, bit-level reliable operation is required. While there exist mechanisms for error detection and correction, the cost-effective implementation in massively parallel accelerators is still an […]
Feb, 2
Portable Performance on Heterogeneous Architectures
Trends in both consumer and high performance computing are bringing not only more cores, but also increased heterogeneity among the computational resources within a single machine. In many machines, one of the greatest computational resources is now their graphics coprocessors (GPUs), not just their primary CPUs. But GPU programming and memory models differ dramatically from […]
Feb, 2
Heterogeneous GPU and CPU acceleration of a finite volume compressible flow solver for multiblock structured grids
The main objective of this project is to investigate the application of heterogeneous acceleration to a finite volume compressible flow solver for multiblock structured grids. Provided as Fortran source code, the ROTORMBMGS computational fluid dynamics program currently uses domain decomposition and message passing to distribute computation across multiple computers. Winning awards for scaling performance, there is […]
Feb, 2
Improving GPGPU Concurrency with Elastic Kernels
Each new generation of GPUs vastly increases the resources available to GPGPU programs. GPU programming models (like CUDA) were designed to scale to use these resources. However, we find that CUDA programs actually do not scale to utilize all available resources, with over 30% of resources going unused on average for programs of the Parboil2 […]
Feb, 2
XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures
Most recent HPC platforms have heterogeneous nodes composed of multi-core CPUs and accelerators, like GPUs. Programming such nodes is typically based on a combination of OpenMP and CUDA/OpenCL codes; scheduling relies on a static partitioning and cost model. We present the XKaapi runtime system for data-flow task programming on multi-CPU and multi-GPU architectures, which supports […]