
Posts

May, 5

Non-separable 2D, 3D and 4D filtering with CUDA

We have presented solutions for fast non-separable floating point convolution in 2, 3 and 4 dimensions, using the CUDA programming language. We believe that these implementations will serve as a complement to the NPP library, which currently only supports 2D filters and images stored as integers. The shared memory implementation with loop unrolling is approximately […]
May, 5

Accelerating Mixed-Abstraction SystemC Models on Multi-Core CPUs and GPUs

Functional verification is a critical part of the hardware design cycle, and it accounts for nearly two-thirds of the overall development time. With increasing complexity of hardware designs and shrinking time-to-market constraints, the time and resources spent on functional verification have increased considerably. To mitigate the increasing cost of functional verification, research and academia […]
May, 5

Assessing the Performance-Energy Balance of Graphics Processors for Spectral Unmixing

Remotely sensed hyperspectral imaging missions are often limited by onboard power restrictions while simultaneously requiring high computing power to address applications with relevant constraints on processing time. In recent years, graphics processing units (GPUs) have emerged as a commodity computing platform suitable to meet real-time processing requirements in hyperspectral image processing. […]
May, 5

GPU-based Parallel Computing for Nonlinear Finite Element Deformation Analysis

Computer-based surgical simulation and non-rigid medical image registration in image-guided interventions are examples of applications that would benefit from real-time deformation simulation of soft tissues. The physics of deformation for biological soft tissue is best described by nonlinear continuum mechanics-based models, which can then be discretized by the Finite Element Method (FEM) for a numerical solution. […]
May, 3

Refresh Rate Modulation for Perceptually Optimized Computer Graphics

The application of human visual perception models to remove imperceptible components in a graphics system has proven effective in achieving significant computational speedup. Previous implementations of such techniques have focused on spatial level-of-detail reduction, which typically results in noticeable degradation of image quality. We introduce Refresh Rate Modulation (RRM), a novel perceptual […]
May, 3

GPU-accelerated ray-tracing for real-time treatment planning

Dose calculation methods in radiotherapy treatment planning require the radiological depth information of the voxels that represent the patient volume to correct for tissue inhomogeneities. This information is acquired by time-consuming ray-tracing-based calculations. For treatment planning scenarios with changing geometries and real-time constraints, this is a severe bottleneck. We implemented an algorithm for the […]
May, 3

Implementation of a PIC simulation using WebGL

This project’s aim is to find a WebGL-based alternative to the Java implementation of OpenPixi, a Java-based Particle-in-Cell (PIC) simulation software, and to add a third dimension. For this purpose, an existing JavaScript library, three.js, was chosen. A handful of approaches are explored and the resulting prototypes are then compared in terms of speed, […]
May, 3

Coalition Structure Generation with the Graphics Processing Unit

Coalition Structure Generation, the problem of finding the optimal division of agents into coalitions, has received considerable attention in recent AI literature. The fastest exact algorithm to solve this problem is IDP-IP* [17], which is a hybrid of two previous algorithms, namely IDP and IP. Given this, it is desirable to speed up IDP as this will, […]
May, 3

A Performance Optimization Support Framework for GPU-based Traffic Simulations with Negotiating Agents

To realize a simulation that can handle hundreds of thousands of negotiating agents while preserving their detailed behaviors, a massive amount of computational power is required. Good programmability of the agents’ code is also essential for realizing complex behaviors. When deploying such negotiating agents in an agent simulation, it is important to be able […]
May, 2

Real-time Simultaneous Pose and Shape Estimation for Articulated Objects Using a Single Depth Camera

In this paper we present a novel real-time algorithm for simultaneous pose and shape estimation for articulated objects, such as human beings and animals. The key to our pose estimation component is embedding the articulated deformation model with exponential-maps-based parametrization into a Gaussian Mixture Model. Benefiting from the probabilistic measurement model, our algorithm requires […]
May, 2

3D FFT on a Single FPGA

The 3D FFT is critical in many physical simulations and image processing applications. On FPGAs, however, the 3D FFT was thought to be inefficient relative to other methods such as convolution-based implementations of multigrid. We find the opposite: a simple design, operating at a conservative frequency, takes 4ms for 16^3, 21ms for 32^3, and 215ms […]
May, 2

Analysis of SuperLU Solvers on Intel MIC Architecture

Intel Xeon Phi is a coprocessor with sixty-one cores on a single chip. The chip has a more powerful FPU that contains 512-bit SIMD registers. The Intel Xeon Phi chip can benefit from algorithms that operate on large vectors. In this work, sequential, multithreaded and distributed versions of SuperLU solvers are tested on the […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org