
Posts

Jul, 7

Random Walks for Image Cosegmentation

We recast the Cosegmentation problem using Random Walker (RW) segmentation as the core segmentation algorithm, rather than the traditional MRF approach adopted in the literature so far. Our formulation is similar to previous approaches in the sense that it also permits Cosegmentation constraints (which impose consistency between the objects extracted from two or more images) using […]
Jul, 7

Implementing Interactive 3D Segmentation on CUDA Using Graph-Cuts and Watershed Transformation

In this paper we present a novel scheme for a very fast implementation of volumetric segmentation using graph cuts. The main benefit of this work is our approach to non-grid region adjacency processing on CUDA, which, to our knowledge, has not yet been done in any efficient way. The watershed transform radically reduces the number […]
Jul, 7

Fast algorithm of ray tracing based on KD-tree structure

Based on the storage characteristics of GPUs, this paper proposes a parallel ray tracing algorithm in which a KD-tree is adopted as the acceleration structure. The nodes are repeatedly split using the intermediate plane of each axis in turn, and the built KD-tree is stored in the texture memory of the GPU. The triangles in a scene […]
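
As a rough illustration of the splitting rule mentioned in the excerpt, the sketch below builds a KD-tree on the CPU by cutting each node at the midpoint of its bounding box, cycling through the x, y and z axes. It is not the paper's GPU algorithm: reading "intermediate plane" as the box midpoint is an assumption, points stand in for triangles, and the names `build_kdtree`, `leaf_size` and `max_depth` are hypothetical.

```python
def build_kdtree(points, lo, hi, depth=0, leaf_size=2, max_depth=16):
    """Recursively split the box [lo, hi] at its midpoint, one axis per level.

    points: list of (x, y, z) tuples (stand-ins for scene triangles).
    lo, hi: opposite corners of the node's axis-aligned bounding box.
    """
    # Stop when the node is small enough or the tree is deep enough.
    if len(points) <= leaf_size or depth >= max_depth:
        return {"leaf": points}

    axis = depth % 3                       # cycle x -> y -> z
    mid = 0.5 * (lo[axis] + hi[axis])      # midpoint plane (assumed meaning)

    left = [p for p in points if p[axis] < mid]
    right = [p for p in points if p[axis] >= mid]

    # Child boxes share the splitting plane.
    left_hi = list(hi)
    left_hi[axis] = mid
    right_lo = list(lo)
    right_lo[axis] = mid

    return {
        "axis": axis,
        "split": mid,
        "left": build_kdtree(left, lo, left_hi, depth + 1, leaf_size, max_depth),
        "right": build_kdtree(right, right_lo, hi, depth + 1, leaf_size, max_depth),
    }


points = [(1, 1, 1), (3, 1, 1), (1, 3, 1), (3, 3, 3)]
tree = build_kdtree(points, lo=(0, 0, 0), hi=(4, 4, 4), leaf_size=1)
```

A flat array of such nodes, rather than nested dictionaries, would be the more natural layout for storage in texture memory.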
Jul, 7

Efficient Parallel RSA Decryption Algorithm for Many-core GPUs with CUDA

Cryptography is an important technique in many applications. In telecommunications, cryptography is necessary when data is communicated over an untrusted medium in the network. RSA is a public-key cryptography algorithm that uses a pair (N, E) as the public key and D as the private key. N is the product of two large prime numbers […]
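
To make the roles of N, E and D concrete, here is a toy sketch of textbook RSA with very small primes. The values 61, 53 and 17 are illustrative assumptions, not taken from the paper, and the code shows only the plain sequential arithmetic, not the parallel GPU decryption the paper proposes.

```python
# Toy textbook RSA, for illustration only: real keys use primes of
# hundreds of digits and padded messages.
p, q = 61, 53              # two small primes (hypothetical values)
n = p * q                  # modulus N = 3233
phi = (p - 1) * (q - 1)    # Euler's totient of N = 3120
e = 17                     # public exponent E, coprime with phi
d = pow(e, -1, phi)        # private exponent D: inverse of E mod phi (Python 3.8+)


def encrypt(message, e=e, n=n):
    """Encrypt an integer message < N with the public key (N, E)."""
    return pow(message, e, n)


def decrypt(cipher, d=d, n=n):
    """Decrypt with the private key D; this modular exponentiation is the costly step."""
    return pow(cipher, d, n)


cipher = encrypt(65)
plain = decrypt(cipher)    # recovers 65
```

Decryption is a single modular exponentiation with a large exponent, which is why it is a natural target for acceleration.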
Jul, 7

Parallel Particle Swarm Optimization on Graphical Processing Unit for Pose Estimation

In this paper, we present a parallel implementation of the Particle Swarm Optimization (PSO) on GPU using CUDA. By fully utilizing the processing power of graphic processors, our implementation provides a speedup of 215x compared to a sequential implementation on CPU. This speedup is significantly superior to what has been reported in recent papers and […]
Jul, 6

Sparselet Models for Efficient Multiclass Object Detection

We develop intermediate representations for deformable part models, and show that such representations have favorable performance characteristics for multi-class problems where the number of classes is large. Our model uses sparse coding of part filters to represent each filter as a sparse linear combination of shared dictionary elements. This leads to a universal set of […]
Jul, 6

Parallel Memory Defragmentation on a GPU

High-throughput memory management techniques such as malloc/free or mark-and-sweep collectors often exhibit memory fragmentation leaving allocated objects interspersed with free memory holes. Memory defragmentation removes such holes by moving objects around in memory so that they become adjacent (compaction) and holes can be merged (coalesced) to form larger holes. However, known defragmentation techniques are slow. […]
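
The compaction and coalescing steps described in the excerpt can be sketched sequentially as follows. This is a minimal CPU model under stated assumptions, not the paper's parallel method: the heap is represented as a list of `(start, size)` allocated blocks, and `compact` and its return values are hypothetical names.

```python
def compact(blocks, heap_size):
    """Slide allocated blocks to the start of the heap so they become adjacent.

    blocks: list of (start, size) pairs for allocated objects.
    Returns (new_blocks, forwarding, hole), where forwarding maps each old
    start address to its new one and hole is the single merged free region.
    """
    new_blocks = []
    forwarding = {}
    cursor = 0                              # next free address after compaction
    for start, size in sorted(blocks):      # process in address order
        forwarding[start] = cursor          # record where the object moved
        new_blocks.append((cursor, size))
        cursor += size
    hole = (cursor, heap_size - cursor)     # all free holes coalesced into one
    return new_blocks, forwarding, hole


new_blocks, forwarding, hole = compact([(0, 4), (10, 2), (20, 6)], heap_size=32)
```

The forwarding table is what a runtime would use to update pointers to moved objects; the new addresses are a running sum of block sizes, so they could in principle be computed with a prefix sum.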
Jul, 6

Image and Video Processing on GPU: Implementation Scheme, Applications and Future Directions

Most recent computer graphics applications are essentially based on multicore general-purpose processor architectures, which include CPUs made up of parallel processors with high processing capability. Due to the rapid shift towards high-definition multimedia, much more memory space and computational resources are needed to achieve better performance. Recently the GPGPUs (which stands for […]
Jul, 6

High Performance System in GPU and CUDA Media Processing System

This paper gives an overview of high-performance media processing with GPUs and CUDA. The GPU is the ubiquitous graphics processing unit found in every PC, laptop, desktop computer, and workstation. In its most basic form, the GPU generates 2D and 3D graphics, images, and video that enable window-based operating systems, graphical user interfaces, video […]
Jul, 6

fMRI analysis on the GPU-possibilities and challenges

Functional magnetic resonance imaging (fMRI) makes it possible to non-invasively measure brain activity with high spatial resolution. There are, however, a number of issues that have to be addressed. One is the large amount of spatio-temporal data that needs to be processed. In addition to the statistical analysis itself, several preprocessing steps, such as slice […]
Jul, 5

GPU-based Assembly of Stiffness Matrices in the Parallel Multilevel Partition of Unity Method

Many real world problems can be modeled with Partial Differential Equations (PDEs). Since for many PDEs no exact solution can be found, there exists a variety of methods which give an approximate solution to those PDEs. One method which can be applied to find an approximate solution for elliptic PDEs is the Parallel Multilevel Partition […]
Jul, 5

A Massively Parallel Adaptive Fast Multipole Method on Heterogeneous Architectures

We describe a parallel fast multipole method (FMM) for highly nonuniform distributions of particles. We employ both distributed memory parallelism (via MPI) and shared memory parallelism (via OpenMP and GPU acceleration) to rapidly evaluate two-body nonoscillatory potentials in three dimensions on heterogeneous high performance computing architectures. We have performed scalability tests with up to 30 […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
