Posts

Nov, 14

On the origin of yet another channel

Cryptanalysis of a cryptographic function such as a stream cipher, block cipher or hash function usually requires human cryptanalytical skills and labour. However, some automation is possible – e.g., by randomness testing suites like NIST/Diehard that can be applied to test statistical properties of cryptographic function outputs. Yet such testing suites are limited to predefined statistical functions. We propose […]
Nov, 14

A finite volume approach for the simulation of nonlinear dissipative acoustic wave propagation

A form of the conservation equations for fluid dynamics is presented, deduced using slightly less restrictive hypotheses than those necessary to obtain the well-known Westervelt equation. This formulation accounts for full wave diffraction, nonlinearity, and thermoviscous dissipative effects. A CLAWPACK-based, 2D finite volume method using the Roe linearization was implemented to obtain numerically […]
Nov, 13

Designing Scientific Applications on GPUs

Many of today’s complex scientific applications now require a vast amount of computational power. General purpose graphics processing units (GPGPUs) enable researchers in a variety of fields to benefit from the computational power of all the cores available inside graphics cards. Understand the Benefits of Using GPUs for Many Scientific Applications: Designing Scientific Applications on […]
Nov, 13

Anatomy of High-Performance Many-Threaded Matrix Multiplication

BLIS is a new framework for rapid instantiation of the BLAS. We describe how BLIS extends the "GotoBLAS approach" to implementing matrix multiplication (GEMM). While GEMM was previously implemented as three loops around an inner kernel, BLIS exposes two additional loops within that inner kernel, casting the computation in terms of the BLIS microkernel so […]
Nov, 13

Utilizing massive parallelism in decoding of modern error-correcting codes for accelerating communication systems simulations

In this paper, a novel approximate algorithm for massively parallel decoding of trellis-based error-correcting codes (ECC) is presented. The potential effect of using such an optimized decoder on the acceleration of simulations of modern communication systems implementing the most recent communication standards, such as LTE-A (Long Term Evolution – Advanced), is evaluated quantitatively by presenting an […]
Nov, 13

GPU Enhancement of the Trigger to Extend Physics Reach at the Large Hadron Collider

At the Large Hadron Collider (LHC), the trigger systems for the detectors must be able to process a very large amount of data in a very limited amount of time, so that the nominal collision rate of 40 MHz can be reduced to a data rate that can be stored and processed in a reasonable […]
Nov, 13

Lattice Simulations using OpenACC compilers

OpenACC compilers allow one to use Graphics Processing Units without having to write explicit CUDA code. Programs can be modified incrementally using OpenMP-like directives, which cause the compiler to generate CUDA kernels to be run on the GPUs. In this article we look at the performance gain in lattice simulations with dynamical fermions using […]
Nov, 13

Indexing million of packets per second using GPUs

Network traffic recorders are devices that record massive volumes of network traffic for security applications, like retrospective forensic investigations. When deployed over very high-speed networks, traffic recorders must process and store millions of packets per second. To enable interactive explorations of such large traffic archives, packet indexing mechanisms are required. Indexing packets at wire rates […]
Nov, 12

Sailfish: a flexible multi-GPU implementation of the lattice Boltzmann method

We present Sailfish, an open source fluid simulation package implementing the lattice Boltzmann method (LBM) on modern Graphics Processing Units (GPUs) using CUDA/OpenCL. We take a novel approach to GPU code implementation and use run-time code generation techniques and a high level programming language (Python) to achieve state of the art performance, while allowing easy […]
Nov, 12

High speed cipher cracking: the case of Keeloq on CUDA

Graphics Processing Units (GPUs) are increasingly popular in the field of high-performance computing for their ability to provide computational power for massively parallel problems at a reduced cost. However, the programming model exposed by the GPGPU software development tools is often insufficient to achieve full performance, and a major rethinking of algorithmic choices is needed. […]
Nov, 12

A Hybrid GPU/CPU FFT Library for Large FFT Problems

Graphics Processing Units (GPUs) have proven to be a promising platform for accelerating large Fast Fourier Transform (FFT) computations. However, current GPU-based FFT implementations use only the GPU for computation and employ the CPU as a mere memory-transfer controller. The computational power of today’s high-performance CPUs is wasted. In this project, a hybrid optimization framework […]
Nov, 12

Performance Evaluation of R with Intel Xeon Phi Coprocessor

Over the years, R has been adopted as a major data analysis and mining tool in many domain fields. As Big Data overwhelms those fields, the computational needs and workload of existing R solutions increase significantly. With recent hardware and software developments, it is possible to enable massive parallelism with existing R solutions with little […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org