Posts

Mar, 7

Converting Data to Task-Parallelism by Rewrites

High-level domain-specific languages for array processing on the GPU are increasingly common, but to date they run only on a single GPU. We argue that languages will need to target multiple devices, even simultaneous combinations of GPU/GPU and CPU/GPU. Increased flexibility may be key to making these languages more easily deployable and thus widespread. To this […]
Mar, 7

Exploring High Performance SQL Databases with Graphics Processing Units

This thesis introduces the development of a new GPU-based database to accelerate queries of Digital Humanities data to extract document texts that are then data-mined to produce visualizations of aspects of the humanities data. The goal is to advance the state-of-the-art in massively parallel database work by investigating methods for utilizing graphical processing units in […]
Mar, 7

Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing

Directive-based GPU programming models are gaining momentum, since they transparently relieve programmers from dealing with the complexity of low-level GPU programming, which often reflects the underlying architecture. However, too much abstraction in directive models puts a significant burden on programmers for debugging applications and tuning performance. In this paper, we propose a directive-based, interactive program debugging […]
Mar, 7

Parallelization of DNA alignment algorithms using GPUs

Since the discovery of deoxyribonucleic acid (DNA), significant technological advances have been made, leading to very large amounts of data gathered for analysis. The tools for this analysis, however, have advanced at a slower pace and have become one of the limiting factors for new discoveries in this field of research. Recently, from the 3D game […]
Mar, 7

Dynamic Workload Division in GPU-CPU Heterogeneous Systems

GPUs provide powerful computational capabilities and huge potential for efficiency optimization. As a result, the CPU-GPU heterogeneous architecture is still a hot area of high-performance computing. However, energy consumption is still the bottleneck of the entire system when the system and its corresponding framework need massive-scale calculation. Most […]
Mar, 7

2014 the 4th International Conference on Computer and Communication Devices, ICCCD 2014

2014-05-20 All accepted papers will be published in the volume of International Journal of Computer Theory and Engineering (IJCTE), and will be included in the Electronic Journals Library, EBSCO, Engineering & Technology Digital Library, Google Scholar, INSPEC, Ulrich’s Periodicals Directory, Crossref, ProQuest, WorldCat, and EI (INSPEC, IET). Analog and Mixed-Signal IC Design and Testing RF […]
Mar, 6

OCCA: A unified approach to multi-threading languages

The inability to predict lasting languages and architectures led us to develop OCCA, a C++ library focused on host-device interaction. Using run-time compilation and macro expansions, the result is a novel single kernel language that expands to multiple threading languages. Currently, OCCA supports device kernel expansions for the OpenMP, OpenCL, and CUDA platforms. Computational results […]
Mar, 6

Hybrid Framework for pairwise DNA Sequence Alignment Using the CUDA compatible GPU

This paper provides a novel framework for accelerating the solution of the pairwise DNA sequence alignment problem using the CUDA parallel paradigm available on NVIDIA GPUs. The main idea is to implement a new algorithm that assigns different nucleotide weights using GPU architectures, then merges the matching subsequences using the CPU to get the optimum […]
Mar, 6

Code Optimization and Scaling of the Astrophysics Software Gadget on Intel Xeon Phi

The whitepaper reports our investigation into the porting, optimization and subsequent performance of the astrophysics software package GADGET, on the Intel Xeon Phi. The GADGET code is intended for cosmological N-body/SPH simulations to solve a wide range of astrophysical tasks. The test cases within the project were simulations of galaxy systems. A performance analysis of […]
Mar, 6

Performance Tradeoff Spectrum of Integer and Floating Point Applications Kernels on Various GPUs

Floating point precision and performance and the ratio of floating point units to integer processing elements on a graphics processing unit accelerator all continue to present complex tradeoffs for optimising core utilisation on modern devices. We investigate various hybrid CPU and GPU combinations using a range of different GPU models occupying different points in this […]
Mar, 6

Performance Analysis for GPU-based Ray-triangle Algorithms

Several algorithms have been proposed during the past years to solve the ray-triangle intersection test. In this paper we collect the most prominent solutions and describe how to parallelize them on modern programmable graphics processing units (GPUs) by means of NVIDIA CUDA. This paper also provides a comprehensive performance analysis based on several optional features […]
Mar, 6

Efficient and Scalable Parallel Zonal Statistics on Large-Scale Species Occurrence Data on GPUs

Analyzing how species are distributed on the Earth has been one of the fundamental questions in the intersections of environmental sciences, geosciences and biological sciences. With world-wide data contributions, more than 375 million species occurrence records for nearly 1.5 million species have been deposited to the Global Biodiversity Information Facility (GBIF) data portal. The sheer […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org