Posts
Aug, 9
Kargus: a Highly-scalable Software-based Intrusion Detection System
As high-speed networks are becoming commonplace, it is increasingly challenging to prevent attack attempts at the edge of the Internet. While many high-performance intrusion detection systems (IDSes) employ dedicated network processors or special memory to meet the demanding performance requirements, this often increases the cost and limits functional flexibility. In contrast, existing software-based IDS […]
Aug, 9
Optimizing Data Warehousing Applications for GPUs Using Kernel Fusion/Fission
Data warehousing applications represent an emergent application arena that requires the processing of relational queries and computations over massive amounts of data. Modern general purpose GPUs are high core count architectures that potentially offer substantial improvements in throughput for these applications. However, there are significant challenges that arise due to the overheads of data movement […]
Aug, 8
Solving the Flexible Job Shop Problem on Multi-GPU
We propose a new framework for the distributed tabu search metaheuristic, designed to be executed on a multi-GPU cluster, i.e., a cluster of nodes equipped with GPU computing units. We propose a hybrid single-walk parallelization of the tabu search, where hybridization consists in examining a number of solutions from a neighborhood concurrently by several GPUs (multi-GPU). […]
Aug, 8
CUDA-Accelerated HD-ODETLAP: Lossy High Dimensional Gridded Data Compression
We present High-dimensional Overdetermined Laplacian Partial Differential Equations (HD-ODETLAP), a high dimensional lossy compression algorithm and CUDA implementation that exploits data correlations across multiple dimensions of gridded GIS data. Exploiting the GPU gives a considerable speedup. In addition, HD-ODETLAP compresses much better than JPEG2000 and 3D-SPIHT, when fixing either the average or the maximum error.
Aug, 8
Policy-based Tuning for Performance Portability and Library Co-optimization
Although modular programming is a fundamental software development practice, software reuse within contemporary GPU kernels is uncommon. For GPU software assets to be reusable across problem instances, they must be inherently flexible and tunable. To illustrate, we survey the performance-portability landscape for a suite of common GPU primitives, evaluating thousands of reasonable program variants across […]
Aug, 8
Large Scale Finite Element Analysis Using GPU Parallel Computing
In the past years, graphic processing units have become a new abundant parallel computing resource on personal computers. In this work parallel computation of a typical case in finite element analysis for solids has been practiced. The solution of 3-D linear elastic static problems with 3 degrees of freedom is fully implemented utilizing the current GPU technology. Discretization of […]
Aug, 8
Using GPU-based Computing To Accelerate Finite Element Problems
Historically, Graphics Processing Units (GPUs) have been used for offloading graphical visualization and were made popular by video games, but with the development of NVIDIA’s CUDA architecture and programming language there has been an increase in the use of GPUs in general purpose (GPGPU) programming. Problems involving large systems of linear equations, such as […]
Aug, 7
Efficient Algorithms for Sorting on GPUs
Sorting is an important problem in computing that has a rich history of investigation by various researchers. In this thesis we focus on this vital problem. In particular, we develop a novel algorithm for sorting on Graphics Processing Units (GPUs). GPUs are multicore architectures that offer the potential of affordable parallelism. We present an efficient […]
Aug, 7
Efficient Monte Carlo sampler for detecting parametric objects in large scenes
Point processes have demonstrated efficiency and competitiveness when addressing object recognition problems in vision. However, simulating these mathematical models is a difficult task, especially on large scenes. Existing samplers suffer from average performances in terms of computation time and stability. We propose a new sampling procedure based on a Monte Carlo formalism. Our algorithm exploits […]
Aug, 7
Landau Gauge Fixing on GPUs and String Tension
We explore the performance of CUDA in performing Landau gauge fixing in Lattice SU(3), using the steepest descent method with Fourier acceleration. The code performance was tested on a Tesla C2070, Fermi architecture. We also present a study of the string tension at finite temperature in the confined phase. The string tension is extracted from […]
Aug, 7
CuBA – a CUDA implementation of BAMPS
Using CUDA as programming language, we create a code named CuBA which is based on the CPU code "Boltzmann Approach for Many Parton Scattering (BAMPS)" developed in Frankfurt in order to study a system of many colliding particles resulting from heavy ion collisions. Furthermore, we benchmark our code with the Riemann Problem and compare the […]
Aug, 7
Swarm-NG: a CUDA Library for Parallel n-body Integrations with focus on Simulations of Planetary Systems
We present Swarm-NG, a C++ library for the efficient direct integration of many n-body systems using highly parallel Graphics Processing Units (GPUs), such as NVIDIA’s Tesla T10 and M2070 GPUs. While previous studies have demonstrated the benefit of GPUs for n-body simulations with thousands to millions of bodies, Swarm-NG focuses on many few-body systems, e.g., thousands […]