Posts

Aug, 10

Block-Relaxation Methods for 3D Constant-Coefficient Stencils on GPUs and Multicore CPUs

Block iterative methods are extremely important as smoothers for multigrid methods, as preconditioners for Krylov methods, and as solvers for diagonally dominant linear systems. Developing robust and efficient algorithms suitable for current and evolving GPU and multicore CPU systems is a significant challenge. We address this issue in the case of constant-coefficient stencils arising in […]
Aug, 9

Massive parallelization of serial inference algorithms for a complex generalized linear model

Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we […]
Aug, 9

CPU-GPU Algorithms for Triangular Surface Mesh Simplification

Mesh simplification and mesh compression are important processes in computer graphics and scientific computing, as such contexts allow for a mesh which takes up less memory than the original mesh. Current simplification and compression algorithms do not take advantage of both the central processing unit (CPU) and the graphics processing unit (GPU). We propose three […]
Aug, 9

OpenCL-based Algorithm for Heat Load Modelling of District Heating System

This paper presents a parallel approach to estimate the parameters in the heat loading of a district heating system by use of the traditional particle swarm optimisation (TPSO) on the Graphic Processing Unit (GPU) using OpenCL. The running time of the algorithm is greatly reduced compared to running on CPU. The heat load is approximated […]
Aug, 9

Kargus: a Highly-scalable Software-based Intrusion Detection System

As high-speed networks are becoming commonplace, it is increasingly challenging to prevent the attack attempts at the edge of the Internet. While many high-performance intrusion detection systems (IDSes) employ dedicated network processors or special memory to meet the demanding performance requirements, it often increases the cost and limits functional flexibility. In contrast, existing software-based IDS […]
Aug, 9

Optimizing Data Warehousing Applications for GPUs Using Kernel Fusion/Fission

Data warehousing applications represent an emergent application arena that requires the processing of relational queries and computations over massive amounts of data. Modern general purpose GPUs are high core count architectures that potentially offer substantial improvements in throughput for these applications. However, there are significant challenges that arise due to the overheads of data movement […]
Aug, 8

Solving the Flexible Job Shop Problem on Multi-GPU

We propose the new framework of the distributed tabu search metaheuristic designed to be executed using a multi-GPU cluster, i.e. cluster of nodes equipped with GPU computing units. We propose a hybrid single-walk parallelization of the tabu search, where hybridization consists in examining a number of solutions from a neighborhood concurrently by several GPUs (multi-GPU). […]
Aug, 8

CUDA-Accelerated HD-ODETLAP: Lossy High Dimensional Gridded Data Compression

We present High-dimensional Overdetermined Laplacian Partial Differential Equations (HD-ODETLAP), a high dimensional lossy compression algorithm and CUDA implementation that exploits data correlations across multiple dimensions of gridded GIS data. Exploiting the GPU gives a considerable speedup. In addition, HD-ODETLAP compresses much better than JPEG2000 and 3D-SPIHT, when fixing either the average or the maximum error.
Aug, 8

Policy-based Tuning for Performance Portability and Library Co-optimization

Although modular programming is a fundamental software development practice, software reuse within contemporary GPU kernels is uncommon. For GPU software assets to be reusable across problem instances, they must be inherently flexible and tunable. To illustrate, we survey the performance-portability landscape for a suite of common GPU primitives, evaluating thousands of reasonable program variants across […]
Aug, 8

Large Scale Finite Element Analysis Using GPU Parallel Computing

In the past years, graphic processing units have become a new abundant parallel computing resource on personal computers. In this work parallel computation of a typical case in finite element analysis for solids has been practiced. The solution of 3-D linear elastic static problems with 3 degrees of freedom is fully implemented utilizing the current GPU technology. Discretization of […]
Aug, 8

Using GPU-based Computing To Accelerate Finite Element Problems

Historically Graphics Processing Units (GPU) have been used for offloading graphical visualization and made popular in use for video games, but with the development of NVIDIA’s CUDA architecture and programming language there has been an increase in the use of GPUs in general purpose (GPGPU) programming. Problems involving large systems of linear equations, such as […]
Aug, 7

Efficient Algorithms for Sorting on GPUs

Sorting is an important problem in computing that has a rich history of investigation by various researchers. In this thesis we focus on this vital problem. In particular, we develop a novel algorithm for sorting on Graphics Processing Units (GPUs). GPUs are multicore architectures that offer the potential of affordable parallelism. We present an efficient […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
