Posts
May, 5
Non-separable 2D, 3D and 4D filtering with CUDA
We have presented solutions for fast non-separable floating point convolution in 2, 3 and 4 dimensions, using the CUDA programming language. We believe that these implementations will serve as a complement to the NPP library, which currently only supports 2D filters and images stored as integers. The shared memory implementation with loop unrolling is approximately […]
May, 5
Accelerating Mixed-Abstraction SystemC Models on Multi-Core CPUs and GPUs
Functional verification is a critical part of the hardware design cycle, and it accounts for nearly two-thirds of the overall development time. With increasing complexity of hardware designs and shrinking time-to-market constraints, the time and resources spent on functional verification have increased considerably. To mitigate the increasing cost of functional verification, research and academia […]
May, 5
Assessing the Performance-Energy Balance of Graphics Processors for Spectral Unmixing
Remotely sensed hyperspectral imaging missions are often limited by onboard power restrictions while simultaneously requiring high computing power to address applications with relevant constraints in terms of processing times. In recent years, graphics processing units (GPUs) have emerged as a commodity computing platform suitable to meet real-time processing requirements in hyperspectral image processing. […]
May, 5
GPU-based Parallel Computing for Nonlinear Finite Element Deformation Analysis
Computer-based surgical simulation and non-rigid medical image registration in image-guided interventions are examples of applications that would benefit from real-time deformation simulation of soft tissues. The physics of deformation for biological soft tissue is best described by nonlinear continuum mechanics-based models, which can then be discretized by the Finite Element Method (FEM) for a numerical solution. […]
May, 3
Refresh Rate Modulation for Perceptually Optimized Computer Graphics
The application of human visual perception models to remove imperceptible components in a graphics system has been proven effective in achieving significant computational speedup. Previous implementations of such techniques have focused on spatial level-of-detail reduction, which typically results in noticeable degradation of image quality. We introduce Refresh Rate Modulation (RRM), a novel perceptual […]
May, 3
GPU-accelerated ray-tracing for real-time treatment planning
Dose calculation methods in radiotherapy treatment planning require the radiological depth information of the voxels that represent the patient volume to correct for tissue inhomogeneities. This information is acquired by time-consuming ray-tracing-based calculations. For treatment planning scenarios with changing geometries and real-time constraints, this is a severe bottleneck. We implemented an algorithm for the […]
May, 3
Implementation of a PIC simulation using WebGL
This project’s aim is to find a WebGL-based alternative to the Java implementation of OpenPixi, a Particle-in-Cell (PIC) simulation package, and to add a third dimension. For this purpose, an existing JavaScript library, three.js, was chosen. A handful of approaches are explored and the resulting prototypes are then compared in terms of speed, […]
May, 3
Coalition Structure Generation with the Graphics Processing Unit
Coalition Structure Generation, the problem of finding the optimal division of agents into coalitions, has received considerable attention in recent AI literature. The fastest exact algorithm to solve this problem is IDP-IP* [17], which is a hybrid of two previous algorithms, namely IDP and IP. Given this, it is desirable to speed up IDP as this will, […]
May, 3
A Performance Optimization Support Framework for GPU-based Traffic Simulations with Negotiating Agents
To realize a simulation that can handle hundreds of thousands of negotiating agents while preserving their detailed behaviors, a massive amount of computational power is required. Good programmability of the agents’ code is also essential for realizing complex behaviors. When deploying such negotiating agents in an agent simulation, it is important to be able […]
May, 2
Real-time Simultaneous Pose and Shape Estimation for Articulated Objects Using a Single Depth Camera
In this paper we present a novel real-time algorithm for simultaneous pose and shape estimation for articulated objects, such as human beings and animals. The key to our pose estimation component is to embed the articulated deformation model with exponential-maps-based parametrization into a Gaussian Mixture Model. Benefiting from the probabilistic measurement model, our algorithm requires […]
May, 2
3D FFT on a Single FPGA
The 3D FFT is critical in many physical simulations and image processing applications. On FPGAs, however, the 3D FFT was thought to be inefficient relative to other methods such as convolution-based implementations of multigrid. We find the opposite: a simple design, operating at a conservative frequency, takes 4 ms for 16^3, 21 ms for 32^3, and 215 ms […]
May, 2
Analysis of SuperLU Solvers on Intel MIC Architecture
Intel Xeon Phi is a coprocessor with sixty-one cores on a single chip. The chip has a more powerful FPU that contains 512-bit SIMD registers. The Intel Xeon Phi chip can benefit from algorithms that operate on large vectors. In this work, sequential, multithreaded and distributed versions of SuperLU solvers are tested on the […]