
Posts

May, 22

GPU accelerated fuzzy connected image segmentation by using CUDA

Image segmentation techniques using fuzzy connectedness principles have shown their effectiveness in segmenting a variety of objects in several large applications in recent years. However, one problem of these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides highly parallel computing power. In this paper, we present […]
May, 22

Improving the Speed of Virtual Rear Projection: A GPU-Centric Architecture

Projection is the only viable way to produce very large displays. Rear projection of large-scale upright displays is often preferred over front projection because of the lack of shadows that occlude the projected image. However, rear projection is not always a feasible option for space and cost reasons. Recent research suggests that many of the […]
May, 22

Improving performance for emergent environments parameter tuning and simulation in games using GPU

Computer games that handle realistic environments are becoming more popular in the game market. Games that make use of natural environments such as the spreading of fire or the flow of water need to be very carefully designed. In order to produce a desired effect of fire or water, a designer needs to try and […]
May, 22

A GPU-based closed frequent itemsets mining algorithm over stream

Closed frequent itemsets are one of several condensed representations of frequent itemsets, which store all the information of frequent itemsets using less space, thus being more suitable for stream mining. This paper considers a problem that to the best of our knowledge has not been addressed, namely, how to use GPU to mine closed frequent […]
May, 22

Optimized GPU Framework for Ultrasound Color Flow Imaging

A GPU framework for ultrasound color flow imaging (CFI) based on auto-correlation is presented. The parallel CFI processing framework implementation is mainly based on CUDA performance features, such as the memory selection strategy, applicable thread structure and high-throughput bandwidth. A parallel convolution algorithm and a multi-channel championship algorithm are proposed. This CFI method achieves a frame rate […]
May, 21

A trigger system based on Graphics Processing Unit (GPU)

We discuss the possible use of GPUs (Graphics Processing Units) in the all-digital trigger and data acquisition (TDAQ) chain of the NA62 experiment at CERN. The exponentially growing interest in using GPUs for general purpose applications is based on the impressive performance achieved (peak performance already exceeding one Teraflop/s), on the high bandwidth to memory […]
May, 21

An architecture design of GPU-accelerated VoD streaming servers with network coding

The graphics processing unit (GPU) has evolved into a general-purpose computing platform. Inspired by this advantage of GPU technology, this paper concerns the design and performance evaluation of a practical GPU-accelerated server architecture for Video-on-Demand (VoD) services with network coding. Following the proposal of an optimized network coding algorithm based on parallel threads on GPU, a GPU-Accelerated Server […]
May, 21

A comprehensive analysis and parallelization of an image retrieval algorithm

The prevalence of the Internet and cloud computing has made multimedia data, such as image data and video data, become major data types in our daily life. For example, many data-intensive applications, such as health care and video recommendation, involve collecting, indexing and retrieving tera-scale multimedia data every day. With such a huge amount of […]
May, 21

Analyzing throughput of GPGPUs exploiting within-die core-to-core frequency variation

State-of-the-art general-purpose graphics processing units (GPGPUs) can offer very high computational throughput for general-purpose, highly-parallel applications using hundreds of available on-chip cores. Meanwhile, as technology is scaled down below 65nm, each core’s maximum frequency varies significantly due to increasing within-die variations. This, in turn, diminishes the throughput improvement of GPGPUs through technology scaling because […]
May, 21

Image processing applications on a low power highly parallel SIMD architecture

In this paper, we present and discuss high-performance implementations of a wide class of image processing applications on a low-power massively parallel SIMD architecture, the ClearSpeed CSX700. We present parallel implementation results for four classes of image processing applications: feature detection (Harris Corner Detector), stereo vision (a class of SSD-like algorithms), model estimation […]
May, 21

Multilevel Granularity Parallelism Synthesis on FPGAs

Recent progress in High-Level Synthesis (HLS) techniques has helped raise the abstraction level of FPGA programming. However, implementation and performance evaluation of the HLS-generated RTL involve lengthy logic synthesis and physical design flows. Moreover, mapping of different levels of coarse grained parallelism onto hardware spatial parallelism affects the final FPGA-based performance both in terms of […]
May, 21

Synthesis of Platform Architectures from OpenCL Programs

The problem of automatically generating hardware modules from a high level representation of an application has been at the research forefront in the last few years. In this paper, we use OpenCL, an industry supported standard for writing programs that execute on multicore platforms and accelerators such as GPUs. Our architectural synthesis tool, SOpenCL (Silicon-OpenCL), […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
