13950

Posts

May, 12

Improving CUDA DNA Analysis Software with Genetic Programming

We genetically improve BarraCUDA using a BNF grammar incorporating C scoping rules with GP. Barracuda maps next generation DNA sequences to the human genome using the Burrows-Wheeler algorithm (BWA) on nVidia Tesla parallel graphics hardware (GPUs). GI using phenotypic tabu search with manually grown code can graft new features giving more than 100 fold speed […]
May, 12

CVPI: A Computer Vision Library For Mobile and Embedded Platforms

CVPI is a library for implementing computer vision programs on computers supporting OpenVG. It adds additional image processing capabilities to OpenVG that are necessary for computer vision, as well a as providing an interface to setup the rendering environment. OpenVG is a hardware accelerated C API for vector and raster 2D graphics. It is widely […]
May, 12

Development of Parallel Architectures for Radar/Video Signal Processing Applications

The applications of digital signal processing continue to expand and use in many different areas such as signal processing, radar tracking, image processing, medical imaging, video broadcasting, and control algorithms for sensor array processing. Most of the signal processing applications are intensive and may not achieve the real time requirements. However, the Multi-core phenomenon has […]
May, 12

Density Estimations for Approximate Query Processing on SIMD Architectures

Approximate query processing (AQP) is an interesting alternative for exact query processing. It is a tool for dealing with the huge data volumes where response time is more important than perfect accuracy (this is typically the case during initial phase of data exploration). There are many techniques for AQP, one of them is based on […]
May, 12

FPGA-Based Design of Numerical Algorithms for Kernel Density Estimation Using High Level Synthesis Approach

FPGA technology can offer significantly higher performance at much lower power than is available from CPUs and GPUs in many computational problems. Unfortunately, programming for FPGA (using hardware description languages, HDL) is a difficult and not-trivial task and is not intuitive for C/C++/Java programmers. To bring the gap between programming effectiveness and difficulty the High […]
May, 10

Age and Gender Classification using Convolutional Neural Networks

Automatic age and gender classification has become relevant to an increasing amount of applications, particularly since the rise of social platforms and social media. Nevertheless, performance of existing methods on real-world images is still significantly lacking, especially when compared to the tremendous leaps in performance recently reported for the related task of face recognition. In […]
May, 10

Numerical Simulation of Melting with Natural Convection Based on Lattice Boltzmann Method and Performed with CUDA Enabled GPU

A new solver is developed to numerically simulate the melting phase change with natural convection. This solver was implemented on a single Nvidia GPU based on the CUDA technology in order to simulate the melting phase change in a 2D rectangular enclosure. The Rayleigh number is of the order of magnitude of 108 and Prandlt […]
May, 10

Tracking Many Solution Paths of a Polynomial Homotopy on a Graphics Processing Unit

Polynomial systems occur in many areas of science and engineering. Unlike general nonlinear systems, the algebraic structure enables to compute all solutions of a polynomial system. We describe our massive parallel predictor-corrector algorithms to track many solution paths of a polynomial homotopy. The data parallelism that provides the speedups stems from the evaluation and differentiation […]
May, 10

GPU-accelerated micromagnetic simulations using cloud computing

Highly-parallel graphics processing units (GPUs) can improve the speed of micromagnetic simulations significantly as compared to conventional computing using central processing units (CPUs). We present a strategy for performing GPU-accelerated micromagnetic simulations by utilizing cost-effective GPU access offered by cloud computing services with an open-source Python-based program for running the MuMax3 micromagnetics code remotely. We […]
May, 10

GPU Ray-Traced Collision Detection: Fine Pipeline Reorganization

Ray-tracing algorithms can be used to render a virtual scene and to detect collisions between objects. Numerous ray-tracing algorithms have been proposed which use data structures optimized for specific cases (rigid objects, deformable objects, etc.). Some solutions try to optimize performance by combining several algorithms to use the most efficient algorithm for each ray. This […]
May, 7

SparkCL: A Unified Programming Framework for Accelerators on Heterogeneous Clusters

We introduce SparkCL, an open source unified programming framework based on Java, OpenCL and the Apache Spark framework. The motivation behind this work is to bring unconventional compute cores such as FPGAs/GPUs/APUs/DSPs and future core types into mainstream programming use. The framework allows equal treatment of different computing devices under the Spark framework and introduces […]
May, 7

Supporting input dependent access pattern algorithms on GPUs using GPUfs

Accelerating processing of very large datasets on GPUs is challenging, in particular when algorithms exhibit unpredictable data access patterns. In this paper we investigate the utility of GPUfs, a library that provides direct access to files from GPU programs, to implement such algorithms. We analyze the system’s bottlenecks, and suggest several modification to the GPUfs […]

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us: