Posts

Oct, 26

Flexible neuronal network simulation framework using code generation for NVidia CUDA

Simulating large-scale computer models of brain structures with spiking neuronal networks has become increasingly popular and feasible with the advent of general-purpose computing on graphics processing units (GPGPU). Modern graphics cards, such as the NVidia range supporting the Compute Unified Device Architecture (CUDA), provide massively parallel computing architectures for this purpose. Earlier GPU […]
Oct, 26

Hand Tracking based on Hierarchical Clustering of Range Data

Fast and robust hand segmentation and tracking is an essential basis for gesture recognition and thus an important component for contact-less human-computer interaction (HCI). Hand gesture recognition based on 2D video data has been intensively investigated. However, in practical scenarios purely intensity based approaches suffer from uncontrollable environmental conditions like cluttered background colors. In this […]
Oct, 25

Scalability of Incompressible Flow Computations on Multi-GPU Clusters Using Dual-Level and Tri-Level Parallelism

High-performance computing using graphics processing units (GPUs) is gaining popularity in the scientific computing field, with many large compute clusters being augmented with multiple GPUs in each node. We investigate hybrid tri-level (MPI-OpenMP-CUDA) parallel implementations to explore the efficiency and scalability of incompressible flow computations on GPU clusters of up to 128 GPUs. This work […]
Oct, 25

Current and Nascent SETI Instruments in the Radio and Optical

Here we describe our ongoing efforts to develop high-performance and sensitive instrumentation for use in the search for extra-terrestrial intelligence (SETI). These efforts include our recently deployed Search for Extraterrestrial Emissions from Nearby Developed Intelligent Populations Spectrometer (SERENDIP V.v) and two instruments currently under development: the Heterogeneous Radio SETI Spectrometer (HRSS) for SETI observations in […]
Oct, 25

Design and Implementation of GPU-Based Prim’s Algorithm

The minimum spanning tree is a classical problem in graph theory that plays a key role in a broad domain of applications. This paper proposes a minimum spanning tree algorithm using Prim’s approach on Nvidia GPUs under the CUDA architecture. By using a newly developed GPU-based Min-Reduction data parallel primitive in the key step of the algorithm, higher […]
Oct, 25

Parallel Execution of AES-CTR Algorithm Using Extended Block Size

Data encryption and decryption are common operations in network-based application programs with security requirements. In order to keep pace with the input data rate in such applications, real-time processing of data encryption/decryption is essential. For example, in an environment where multimedia data is streamed, high-speed data encryption/decryption is crucial. In this paper, […]
Oct, 25

The Model of Computation of CUDA and its Formal Semantics

We formalize the model of computation of modern graphics cards based on the specification of Nvidia’s Compute Unified Device Architecture (CUDA). CUDA programs are executed by thousands of threads concurrently and have access to several different types of memory with unique access patterns and latencies. The underlying hardware uses a single instruction, multiple threads execution […]
Oct, 25

BEAGLE: an Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large datasets and realistic or interesting models of evolution, these approaches remain computationally demanding. […]
Oct, 25

An Evolutionary Optimization Strategy Using Graphics Processing Units to Efficiently Investigate Gene-Gene Interactions in Genetic Association Studies

The analysis of gene-gene interactions related to common complex human diseases is complicated by the increasing scale of genetic association analysis. Concurrent with the advances in genetic technology that led to these large data sets, improvements have been made in parallel computing with graphics processing units (GPUs). The data-intensive nature of genetic association analysis makes […]
Oct, 25

Distributed GPU Password Cracking Research Project

This research project explores the possibilities of integrating GPU processing power with a network cluster in order to achieve better performance with respect to password cracking. First, a literature study is performed on the field of passwords in general, GPGPU computing, and distributed computing by means of middleware. With these building blocks, combined with current […]
Oct, 25

Extending Scala with General Purpose GPU Programming

In this report we document an attempt to make it easier to use powerful GPUs by extending the Scala compiler to automatically offload work to the GPU. We benchmark similar code and find that it provides a 2-3x speedup compared to the CPU alone. Finally, we discuss ways to improve the extension to offload more […]
Oct, 25

Development of a Flow Solver with Complex Kinetics on the Graphic Processing Units

This paper reports on the implementation of a numerical solver on graphics processing units (GPUs) to model reactive gas mixtures with detailed chemical kinetics. The solver incorporates high-order finite volume methods for solving the fluid dynamical equations coupled with stiff source terms. The chemical kinetics are solved implicitly via an operator-splitting method. We […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
