
Posts

Mar, 9

NLSEmagic: Nonlinear Schrödinger Equation Multidimensional Matlab-based GPU-accelerated Integrators using Compact High-order Schemes

We present a simple-to-use, yet powerful code package called NLSEmagic to numerically integrate the nonlinear Schrödinger equation in one, two, and three dimensions. NLSEmagic is a high-order finite-difference code package which utilizes graphics processing unit (GPU) parallel architectures. The codes running on the GPU are many times faster than their serial counterparts, and […]
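To illustrate the kind of integrator the abstract describes, here is a minimal serial sketch of a finite-difference time stepper for the 1D cubic nonlinear Schrödinger equation. This is not NLSEmagic's code (which uses compact high-order schemes on the GPU); it assumes the standard form i·ψ_t = −½ψ_xx − |ψ|²ψ, a second-order Laplacian, and classical RK4 in time.

```python
import numpy as np

def nlse_rhs(psi, dx):
    """d(psi)/dt = i*(0.5*psi_xx + |psi|^2 * psi), zero Dirichlet boundaries."""
    lap = np.zeros_like(psi)
    lap[1:-1] = (psi[2:] - 2 * psi[1:-1] + psi[:-2]) / dx**2
    return 1j * (0.5 * lap + np.abs(psi) ** 2 * psi)

def rk4_step(psi, dt, dx):
    """One classical fourth-order Runge-Kutta step."""
    k1 = nlse_rhs(psi, dx)
    k2 = nlse_rhs(psi + 0.5 * dt * k1, dx)
    k3 = nlse_rhs(psi + 0.5 * dt * k2, dx)
    k4 = nlse_rhs(psi + dt * k3, dx)
    return psi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

x = np.linspace(-10, 10, 401)
dx = x[1] - x[0]
psi = (1 / np.cosh(x)).astype(complex)  # bright-soliton initial profile
for _ in range(100):
    psi = rk4_step(psi, dt=1e-3, dx=dx)

# The NLSE conserves the L2 norm ("mass"), so a stable scheme should
# keep this close to its initial value of ~2 for the sech profile.
mass = np.sum(np.abs(psi) ** 2) * dx
```

On a GPU, the Laplacian stencil and the pointwise nonlinear term are the natural kernels: each grid point is independent, which is exactly the data parallelism the package exploits.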
Mar, 8

Scalable framework for mapping streaming applications onto multi-GPU systems

Graphics processing units leverage a large array of parallel processing cores to boost the performance of a specific streaming computation pattern frequently found in graphics applications. Unfortunately, while many other general-purpose applications do exhibit the required streaming behavior, they also possess unfavorable data layouts and poor computation-to-communication ratios that penalize any straightforward execution […]
Mar, 8

Dynamic Task-Scheduling and Resource Management for GPU Accelerators in Medical Imaging

For medical imaging applications, timely execution of tasks is essential. Hence, when multiple applications run on the same system, scheduling with support for task preemption and prioritization becomes mandatory. Using GPUs as accelerators in this domain imposes new challenges, since the GPU's common FIFO scheduling supports neither task prioritization nor preemption. As a remedy, […]
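The contrast the abstract draws is between FIFO dispatch and priority-aware dispatch. A hypothetical sketch of the latter, with invented task names and priorities purely for illustration (the paper's actual scheduler is not reproduced here):

```python
import heapq

class PriorityScheduler:
    """Dispatch tasks by (priority, arrival order); lower value = more urgent.
    A plain FIFO queue, by contrast, would ignore priority entirely."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker: preserves FIFO among equal priorities

    def submit(self, task, priority):
        heapq.heappush(self._heap, (priority, self._counter, task))
        self._counter += 1

    def next_task(self):
        return heapq.heappop(self._heap)[2]

sched = PriorityScheduler()
sched.submit("background_reconstruction", priority=5)  # hypothetical tasks
sched.submit("interactive_view_update", priority=1)
sched.submit("batch_export", priority=5)

order = [sched.next_task() for _ in range(3)]
# The urgent interactive task is dispatched first; the two equal-priority
# tasks keep their submission order.
```

Preemption, the harder half of the problem on real GPUs, would additionally require interrupting a running kernel, which is exactly what common GPU FIFO execution does not support.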
Mar, 8

Compiler Assisted Runtime Adaptation

In this dissertation, we address the problem of runtime adaptation of an application to its execution environment. A typical example is changing the processing element on which a computation is executed, considering the available processing elements in the system. This is done based on the information and instrumentation provided by the compiler and taking into account […]
Mar, 8

Paragon: Collaborative Speculative Loop Execution on GPU and CPU

The rise of graphics engines as one of the main parallel platforms for general purpose computing has ignited a wide search for better programming support for GPUs. Due to their non-traditional execution model, developing applications for GPUs is usually very challenging, and as a result, these devices are left under-utilized in many commodity systems. Several […]
Mar, 8

PISTON: A Portable Cross-Platform Framework for Data-Parallel Visualization Operators

Due to the wide variety of current and next-generation supercomputing architectures, the development of high-performance parallel visualization and analysis operators frequently requires rewriting the underlying algorithms for many different platforms. In order to facilitate portability, we have devised a framework for creating such operators that employs the data-parallel programming model. By writing the operators using […]
Mar, 6

Efficient Relational Algebra Algorithms and Data Structures for GPU

Relational databases remain an important application domain for organizing and analyzing the massive volume of data generated as sensor technology, retail and inventory transactions, social media, computer vision, and new fields continue to evolve. At the same time, processor architectures are beginning to shift towards hierarchical and parallel architectures employing throughput-optimized memory systems, lightweight multi-threading, […]
Mar, 6

PartialRC: A Partial Recomputing Method for Efficient Fault Recovery on GPGPUs

GPGPUs are increasingly being used as performance accelerators for HPC (High Performance Computing) applications in CPU/GPU heterogeneous computing systems, including TianHe-1A, the world's fastest supercomputer in the TOP500 list, built at NUDT (National University of Defense Technology) last year. However, despite their performance advantages, GPGPUs do not provide built-in fault-tolerance mechanisms to offer reliability […]
Mar, 6

KUDA: GPU Accelerated Split Race Checker

We propose a novel approach for runtime verification on computers with a large number of computation cores, without any hardware extension to the mainstream PC environment. The goal of the approach is to make use of all hardware resources to decouple the computational overhead of traditional race checkers by parallelizing the runtime verification. We distinguish between two […]
Mar, 6

Synthesizing Software from a ForSyDe Model Targeting GPGPUs

Today, a plethora of parallel execution platforms are available. One platform in particular is the GPGPU – a massively parallel architecture designed for exploiting data parallelism. However, GPGPUs are notoriously difficult to program due to the way data is accessed and processed, and many interconnected factors affect the performance. This makes it an exceptionally challenging task […]
Mar, 6

High-Performance Distributed Multi-Model / Multi-Kernel Simulations: A Case-Study in Jungle Computing

High-performance scientific applications require more and more compute power. The concurrent use of multiple distributed compute resources is vital for making scientific progress. The resulting distributed system, a so-called Jungle Computing System, is both highly heterogeneous and hierarchical, potentially consisting of grids, clouds, stand-alone machines, clusters, desktop grids, mobile devices, and supercomputers, possibly with accelerators […]
Mar, 2

Parallel Implementation of Similarity Measures on GPU Architecture using CUDA

Image processing and pattern recognition algorithms can take a long time to execute on a single-core processor. The Graphics Processing Unit (GPU) is now popular due to its speed, programmability, low cost, and the many execution cores built into it. Most researchers started work to use GPUs as a processing unit with a single core […]
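Two similarity measures that commonly appear in such template-matching work are the sum of squared differences (SSD) and normalized cross-correlation (NCC). A vectorized NumPy sketch of both follows; the abstract does not specify which measures the paper parallelizes, so these are representative examples only. On a GPU, a CUDA kernel would compute the same reductions with, say, one thread block per candidate window.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: lower = more similar, 0 = identical."""
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def ncc(a, b):
    """Normalized cross-correlation: 1.0 = identical up to gain and offset."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2)))

patch = np.array([[1, 2], [3, 4]])
same = np.array([[1, 2], [3, 4]])
scaled = 2 * patch + 5  # same pattern under a brightness/contrast change

# SSD flags the scaled patch as different; NCC still scores it as a
# perfect match, which is why NCC is preferred under illumination changes.
```

Both measures reduce a window of pixels to a single score, which is the map-reduce shape that GPUs accelerate well.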

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
