Posts
Jun, 5
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS
GROMACS is a widely used package for biomolecular simulation, and over the last two decades it has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world. Here, we describe some of the ways we have been able to realize this through the use of […]
Jun, 5
Fast algorithms and efficient GPU implementations for the Radon transform and the back-projection operator represented as convolution operators
The Radon transform and its adjoint, the back-projection operator, can both be expressed as convolutions in log-polar coordinates. Hence, fast algorithms for the application of the operators can be constructed by using FFT, if data is resampled at log-polar coordinates. Radon data is typically measured on an equally spaced grid in polar coordinates, and reconstructions […]
Jun, 5
7th International Conference on Signal Processing Systems (ICSPS), 2015
Topics: Adaptive Filtering & Signal Processing Ad-Hoc and Sensor Networks Analog and Mixed Signal Processing Array Signal Processing Audio and Electroacoustics Audio/Speech Processing and Coding Bioimaging and Signal Processing Biometrics & Authentification Biosignal Processing & Understanding Communication and Broadband Networks Communication Signal processing Computer Vision & Virtual Reality Cryptography and Network Security Design and Implementation […]
Jun, 3
A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems
Non-volatile memory (NVM) devices, such as Flash, phase change RAM, spin transfer torque RAM, and resistive RAM, offer several advantages and challenges when compared to conventional memory technologies, such as DRAM and magnetic hard disk drives (HDDs). In this paper, we present a survey of software techniques that have been proposed to exploit the advantages […]
Jun, 1
Genetically Improved BarraCUDA
BarraCUDA is a C program which uses the BWA algorithm in parallel with nVidia CUDA to align short next generation DNA sequences against a reference genome. The genetically improved (GI) code is up to three times faster on short paired end reads from The 1000 Genomes Project and 60 percent more accurate on a short […]
Jun, 1
Research on the fast Fourier transform of image based on GPU
Study of general purpose computation by GPU (Graphics Processing Unit) can improve the image processing capability of micro-computer system. This paper studies the parallelism of the different stages of decimation in time radix 2 FFT algorithm, designs the butterfly and scramble kernels and implements 2D FFT on GPU. The experiment result demonstrates the validity and […]
Jun, 1
A Parallel Cellular Automaton Simulation Framework using CUDA
In the current digital age, the use of cellular automata to simulate natural systems has grown more popular as our understanding of cellular systems increases. Up until about a decade ago, digital models based on the concept of cellular automata have primarily been simulated with sequential rule application algorithms, which do not exploit the inherent […]
Jun, 1
Quantum Chemistry for Solvated Molecules on Graphical Processing Units (GPUs) using Polarizable Continuum Models
The conductor-like polarization model (C-PCM) with switching/Gaussian smooth discretization is a widely used implicit solvation model in chemical simulations. However, its application in quantum mechanical calculations of large-scale biomolecular systems can be limited by computational expense of both the gas phase electronic structure and the solvation interaction. We have previously used graphical processing units (GPUs) […]
Jun, 1
Efficient FFT mapping on GPU for radar processing application: modeling and implementation
General-purpose multiprocessors (as, in our case, Intel IvyBridge and Intel Haswell) increasingly add GPU computing power to the former multicore architectures. When used for embedded applications (for us, Synthetic aperture radar) with intensive signal processing requirements, they must constantly compute convolution algorithms, such as the famous Fast Fourier Transform. Due to its "fractal" nature (the […]
May, 29
Optimized Password Recovery for Encrypted RAR on GPUs
RAR uses classic symmetric encryption algorithm SHA-1 hashing and AES algorithm for encryption, and the only method of password recovery is brute force, which is very time-consuming. In this paper, we present an approach using GPUs to speed up the password recovery process. However, because the major calculation and time-consuming part, SHA-1 hashing, is hard […]
May, 29
Large-scale network simulation over heterogeneous computing architecture
The simulation is a primary step on the evaluation process of modern networked systems. The scalability and efficiency of such a tool in view of increasing complexity of the emerging networks is a key to derive valuable results. The discrete event simulation is recognized as the most scalable model that copes with both parallel and […]
May, 29
Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms
Hardware-aware design and optimization is crucial in exploiting emerging architectures for PDE-based computational fluid dynamics applications. In this work, we study optimizations aimed at acceleration of OpenFOAM-based applications on emerging hybrid heterogeneous platforms. OpenFOAM uses MPI to provide parallel multi-processor functionality, which scales well on homogeneous systems but does not fully utilize the potential per-node […]

