
Posts

Jun, 9

Understanding Dynamic Parallelism at Any Scale with Allinea’s Unified Tools (webinar)

Dynamic Parallelism is a great new feature introduced by NVIDIA in CUDA 5. As powerful features like this are introduced, the complexity of debugging and profiling often increases. This webinar will provide technical insight into how Allinea’s powerful tools can save the day if bugs come up when developing with Dynamic Parallelism. The webinar, presented […]
Jun, 8

GPU Acceleration of Particle Advection Workloads in a Parallel, Distributed Memory Setting

Although there has been significant research in GPU acceleration, both of parallel simulation codes (i.e., GPGPU) and of single GPU visualization and analysis algorithms, there has been relatively little research devoted to visualization and analysis algorithms on GPU clusters. This oversight is significant: parallel visualization and analysis algorithms have markedly different characteristics – computational load, […]
Jun, 8

High Resolution Sparse Voxel DAGs

We show that a binary voxel grid can be represented orders of magnitude more efficiently than using a sparse voxel octree (SVO) by generalising the tree to a directed acyclic graph (DAG). While the SVO allows for efficient encoding of empty regions of space, the DAG additionally allows for efficient encoding of identical regions of […]
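The idea in this excerpt, merging identical subtrees of a sparse voxel octree so that they are stored once, can be illustrated with a small CPU-side sketch. This is not the paper's construction algorithm: the function name `build_dag`, the `FULL` marker, and the dictionary-based deduplication are all assumptions made for illustration, and the recursion visits every cell, so it only suits tiny grids.

```python
FULL = "full"  # hypothetical marker for a filled unit voxel


def build_dag(grid_size, filled):
    """Build a deduplicated octree over a cubic binary voxel grid.

    grid_size must be a power of two; filled is a set of (x, y, z) tuples.
    Returns (root, nodes, svo_node_count), where nodes maps each distinct
    8-tuple of children to a node id, and svo_node_count is the number of
    interior nodes a plain sparse voxel octree would have stored.
    """
    nodes = {}       # children tuple -> node id (identical subtrees share an id)
    svo_count = [0]  # interior nodes a tree without sharing would need

    def build(x, y, z, size):
        if size == 1:
            return FULL if (x, y, z) in filled else None
        half = size // 2
        children = tuple(
            build(x + dx * half, y + dy * half, z + dz * half, half)
            for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)
        )
        if all(child is None for child in children):
            return None  # empty region: no node stored (the "sparse" part)
        svo_count[0] += 1
        # Reuse the existing id if an identical subtree was already seen.
        return nodes.setdefault(children, len(nodes))

    root = build(0, 0, 0, grid_size)
    return root, nodes, svo_count[0]


if __name__ == "__main__":
    solid = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
    root, nodes, svo_nodes = build_dag(4, solid)
    print("SVO interior nodes:", svo_nodes, "DAG nodes:", len(nodes))
```

For a fully filled 4×4×4 grid, all eight second-level subtrees are identical, so the sketch stores one shared node plus the root where a plain octree would store nine interior nodes.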
Jun, 8

Efficient Parallel Proximity Queries and an Application to Highly Complex Motion Planning Problems with Many Narrow Passages

In industrial manufacturing, like the automotive industry, digital mock-ups are used to design complex machinery with the help of computer systems. In this field, motion planning algorithms play an important role to ensure the (de-)composability of the digital prototypes. In the last decades, sampling-based motion planning algorithms have shown themselves to be practical in this […]
Jun, 8

Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach

GPUs (Graphics processing units) can be used for general purpose parallel computation. Developers can develop parallel programs running on GPUs using different computing architectures like CUDA or OpenCL. The Optimal Matrix Chain Multiplication problem is an optimization problem to find the optimal order for multiplying a chain of matrices. The optimal order of multiplication depends […]
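For readers unfamiliar with the matrix chain ordering problem named in this excerpt, the textbook dynamic program can be sketched as below. This is a sequential Python sketch of the standard O(n³) recurrence, not the GPU implementation studied in the paper; the function names `matrix_chain_order` and `parenthesize` are illustrative assumptions.

```python
import math


def matrix_chain_order(dims):
    """Return (min_cost, split) for multiplying a chain of matrices.

    Matrix i (0-based) has shape dims[i] x dims[i + 1]. The cost counts
    scalar multiplications; split[i][j] is the index k at which the
    product of matrices i..j is best divided into (i..k) and (k+1..j).
    """
    n = len(dims) - 1
    if n < 1:
        raise ValueError("need at least one matrix")
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]
    # Solve sub-chains in order of increasing length.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = math.inf
            for k in range(i, j):
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i] * dims[k + 1] * dims[j + 1])
                if c < cost[i][j]:
                    cost[i][j] = c
                    split[i][j] = k
    return cost[0][n - 1], split


def parenthesize(split, i, j):
    """Render the optimal order for matrices i..j as a string."""
    if i == j:
        return f"A{i + 1}"
    k = split[i][j]
    return f"({parenthesize(split, i, k)} x {parenthesize(split, k + 1, j)})"


if __name__ == "__main__":
    best, split = matrix_chain_order([10, 30, 5, 60])
    print(best, parenthesize(split, 0, 2))  # 4500 ((A1 x A2) x A3)
```

All cells for a given chain length depend only on shorter chains, so they are independent of one another; that independence is what a parallel version would typically exploit.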
Jun, 8

Modernizing the core quantum chemistry algorithms

This document covers the basics of computational chemistry and how, using modern programming techniques, the theory can be efficiently implemented on digital computers. The computer implementations are developed from the core two-electron integrals to many-body and coupled cluster algorithms. Particular attention is paid to the physical constraints of the computer resources and the […]
Jun, 7

How a Single Chip Causes Massive Power Bills. GPUSimPow: A GPGPU Power Simulator

Modern GPUs are true power houses in every meaning of the word: While they offer general-purpose (GPGPU) compute performance an order of magnitude higher than that of conventional CPUs, they have also been rapidly approaching the infamous "power wall", as a single chip sometimes consumes more than 300W. Thus, the design space of GPGPU microarchitecture […]
Jun, 7

Genetic Programming using the Karva Gene Expression Language on Graphical Processing Units

Genetic Programming (GP) has been employed in many problem domains, and as a result, it has been the subject of much scientific inquiry. The extensive literature body of GP has reported applications in algorithm discovery, image enhancement and cooperative multi-agent systems, as well as many other areas and disciplines, such as agent-based modelling in Geography […]
Jun, 7

Parallel Dynamic Solidification Model of Continuous Steel Casting on GPU

Nowadays, dynamic solidification models of continuously cast steel are commonly used in steelworks around the world to control the casting process and to monitor steel production. Moreover, these models of the transient temperature field can also be utilized for optimization of continuous casting, its on-line regulation, or may help operators to solve non-standard or breakdown […]
Jun, 7

Scientific Computing on Hybrid Architectures

Modern computer architectures, with multicore CPUs and GPUs or other accelerators, make stronger demands than ever on writers of scientific code. Normally, the most efficient program has to be written – using a substantial effort – by expert programmers for a certain application on a particular computer. This thesis deals with several algorithmic and technical […]
Jun, 7

CUDA Based Performance Evaluation of the Computational Efficiency of the DCT Image Compression Technique on Both the CPU and GPU

Recent advances in computing such as massively parallel GPUs (Graphical Processing Units), coupled with the need to store and deliver large quantities of digital data, especially images, have brought a number of challenges for computer scientists, the research community and other stakeholders. These challenges, such as prohibitively large costs to manipulate the digital data amongst […]
Jun, 6

Parallel Implementation of Finite Element Codes using CUDA

The purpose of this work is to study the performance of parallel computation of the Finite Element Method using NVIDIA’s CUDA. The numerical experiments are performed only on the stiffness matrix using the conjugate gradient method. In addition, the generalized minimal residual method is considered to solve the Stokes problem using both PETSc and CUDA. […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org