
Posts

Oct, 2

Power Management and Optimization

After many years of focusing on "faster" computers, people have started taking notice of the fact that the race for "speed" has had the unfortunate side effect of increasing the total power consumed, thereby increasing the total cost of ownership of these machines. The heat produced has required expensive cooling facilities. As a result, it […]
Oct, 2

Development of a Chemically Reacting Flow Solver on the Graphic Processing Units

The focus of the current research is to develop a numerical framework on Graphics Processing Units (GPUs) capable of modeling chemically reacting flow. The framework incorporates a high-order finite volume method coupled with an implicit solver for the chemical kinetics. Both the fluid solver and the kinetics solver are designed to take advantage of […]
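The excerpt names an implicit solver for the chemical kinetics without giving the scheme; as a generic sketch only (not necessarily this work's method), stiff species equations in one cell with source term ω(y) are often advanced with backward Euler and Newton iterations:

    % y: species state in one cell, omega(y): chemical source term
    y^{n+1} = y^{n} + \Delta t \, \omega\!\left(y^{n+1}\right)
    % each Newton iteration solves a small dense linear system per cell
    \left(I - \Delta t \, \frac{\partial \omega}{\partial y}\right)\delta y
        = y^{n} + \Delta t\,\omega\!\left(y^{(k)}\right) - y^{(k)},
    \qquad y^{(k+1)} = y^{(k)} + \delta y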
Oct, 2

An Execution Model and Runtime For Heterogeneous Many-Core Systems

The emergence of heterogeneous and many-core architectures presents a unique opportunity to deliver order of magnitude performance increases to high performance applications by matching certain classes of algorithms to specifically tailored architectures. However, their ubiquitous adoption has been limited by a lack of programming models and management frameworks designed to reduce the high degree of […]
Oct, 2

Power-Efficient Accelerators for High-Performance Applications

Computers, regardless of their function, are always better if they can operate more quickly. The addition of computation resources allows for improved response times, greater functionality and more flexibility. The drawback with improving a computer’s performance, however, is that it often comes at the cost of power and energy consumption. For many platforms, this is […]
Oct, 2

PTask: Operating System Abstractions To Manage GPUs as Compute Devices

We propose a new set of OS abstractions to support GPUs and other accelerator devices as first class computing resources. These new abstractions, collectively called the PTask API, support a data flow programming model. Because a PTask graph consists of OS-managed objects, the kernel has sufficient visibility and control to provide system-wide guarantees like fairness […]
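The abstract describes a dataflow programming model whose graph structure is visible to the scheduler. The self-contained sketch below is hypothetical: the names (Task, deps, run) are invented for illustration and are not the PTask API, which this excerpt does not show; it only illustrates the general shape of a task graph a runtime can reason about.

    #include <cstddef>
    #include <cstdio>
    #include <functional>
    #include <vector>

    // Hypothetical dataflow graph: nodes are tasks, edges carry values, and a
    // runtime runs each node once its inputs are ready. A real system would
    // dispatch GPU kernels here instead of CPU lambdas.
    struct Task {
        std::vector<int> deps;                             // upstream task ids
        std::function<int(const std::vector<int>&)> run;   // consumes upstream outputs
    };

    int main() {
        std::vector<Task> graph;
        graph.push_back({{},  [](const std::vector<int>&)   { return 21; }});       // source
        graph.push_back({{0}, [](const std::vector<int>& v) { return v[0] * 2; }}); // transform
        graph.push_back({{1}, [](const std::vector<int>& v) {                       // sink
            std::printf("result = %d\n", v[0]); return 0; }});

        // Toy scheduler: the tasks above are already in topological order, so a
        // single pass suffices; the graph tells it which outputs feed which inputs.
        std::vector<int> out(graph.size());
        for (std::size_t i = 0; i < graph.size(); ++i) {
            std::vector<int> in;
            for (int d : graph[i].deps) in.push_back(out[d]);
            out[i] = graph[i].run(in);
        }
        return 0;
    }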
Oct, 2

A Complete Description of the UnPython and Jit4GPU Framework

A new compilation framework enables the execution of numerically intensive applications in an execution environment that is formed by multi-core Central Processing Units (CPUs) and Graphics Processing Units (GPUs). A critical innovation is the use of a variation of Linear Memory Access Descriptors (LMADs) to analyze loop nests and determine automatically which memory locations must be […]
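For readers unfamiliar with LMADs, the general idea (a generic formulation, not the specific variation used in this framework) is to summarize, in closed form, the addresses an affine loop nest touches:

    % beta: base offset, delta_k: stride contributed by loop k, n_k: trip count
    \mathcal{A} \;=\; \Bigl\{\, \beta + \sum_{k=1}^{d} \delta_k\, i_k \;\Bigm|\; 0 \le i_k < n_k \Bigr\}

Comparing such closed-form sets for the host and device copies of an array is what lets a runtime decide which locations actually need to be transferred.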
Oct, 2

Parallel probabilistic model checking on general purpose graphics processors

We present algorithms for parallel probabilistic model checking on general-purpose graphics processing units (GPGPUs). Our improvements target the numerical components of the traditional sequential algorithms. In particular, we capitalize on the fact that in most of them, operations like matrix-vector multiplication and solving systems of linear equations are the main complexity bottlenecks. Since linear […]
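Since the excerpt singles out matrix-vector multiplication as a bottleneck, here is a minimal CUDA sketch of a sparse matrix-vector product in CSR format, one thread per row; the array names are illustrative and not taken from the paper:

    // y = A * x, with A stored in CSR form (row_ptr, col_idx, val)
    __global__ void spmv_csr(int n_rows,
                             const int*    row_ptr,   // n_rows + 1 row offsets
                             const int*    col_idx,   // column of each nonzero
                             const double* val,       // value of each nonzero
                             const double* x,
                             double*       y)
    {
        int row = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per row
        if (row < n_rows) {
            double sum = 0.0;
            for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
                sum += val[j] * x[col_idx[j]];
            y[row] = sum;
        }
    }

    // Typical launch: enough 256-thread blocks to cover every row.
    //   int blocks = (n_rows + 255) / 256;
    //   spmv_csr<<<blocks, 256>>>(n_rows, row_ptr, col_idx, val, x, y);

Iterative solvers for the linear systems mentioned in the abstract (e.g., Jacobi-style iterations) invoke such a kernel once per iteration.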
Oct, 2

CHPS: An Environment for Collaborative Execution on Heterogeneous Desktop Systems

Modern commodity desktop computers equipped with multi-core Central Processing Units (CPUs) and specialized but programmable co-processors are capable of providing remarkable computational performance. However, approaching this performance is not a trivial task, as it requires the coordination of architecturally different devices for cooperative execution. Coordinating the use of the full set of processing units […]
Oct, 1

Parallel Computing based on GPGPU using Compute Unified Device Architecture

The demand for processing huge amounts of data within a limited time, together with the growing computing capability of the Graphics Processing Unit (GPU), leads us to the world of parallel computing on the General Purpose GPU (GPGPU). Because of the exposed parallelism, a GPGPU can assign processing tasks to multiple threads and execute these threads simultaneously. […]
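As a concrete illustration of this thread-level parallelism (not code from the paper), a complete CUDA program that assigns one array element to each thread:

    #include <cstdio>
    #include <cuda_runtime.h>

    // One thread per element: each of the n threads computes a single sum.
    __global__ void vector_add(const float* a, const float* b, float* c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
        if (i < n)                                       // guard the tail block
            c[i] = a[i] + b[i];
    }

    int main()
    {
        const int n = 1 << 20;
        float *a, *b, *c;
        cudaMallocManaged(&a, n * sizeof(float));        // unified memory for brevity
        cudaMallocManaged(&b, n * sizeof(float));
        cudaMallocManaged(&c, n * sizeof(float));
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        int threads = 256;
        int blocks  = (n + threads - 1) / threads;       // enough blocks to cover n
        vector_add<<<blocks, threads>>>(a, b, c, n);     // threads execute concurrently
        cudaDeviceSynchronize();

        std::printf("c[0] = %f\n", c[0]);                // expect 3.0
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }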
Oct, 1

Correctly Treating Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU

Automatic compilation for multiple types of devices is important, especially given the current trends towards heterogeneous computing. This paper concentrates on some issues in compiling fine-grained SPMD-threaded code (e.g., GPU CUDA code) for multicore CPUs. It points out some correctness pitfalls in existing techniques, particularly in their treatment of implicit synchronizations. It then describes a […]
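A common way to run fine-grained SPMD code on a CPU is to serialize the threads of a block into a loop; a barrier such as __syncthreads() then forces that loop to be split so every logical thread finishes the pre-barrier code first. The sketch below illustrates this general idea only; it is not the treatment proposed in the paper, and it assumes 256-thread blocks with n a multiple of the block size:

    // CUDA source: threads stage data in shared memory, synchronize, then read
    // a neighbouring thread's element.
    __global__ void shift_left(const int* in, int* out, int n)
    {
        __shared__ int buf[256];                 // assumes blockDim.x == 256
        int t = threadIdx.x;
        int i = blockIdx.x * blockDim.x + t;
        if (i < n) buf[t] = in[i];
        __syncthreads();                         // every thread must have written buf
        if (i < n) out[i] = buf[(t + 1) % blockDim.x];
    }

    /* Serialized CPU form (sketch): the thread loop is split at the barrier so
       the second half never reads buf[] before all logical threads wrote it.

       for (t = 0; t < blockDim; ++t)            // logical threads, first region
           if (base + t < n) buf[t] = in[base + t];
       // the __syncthreads() becomes the boundary between the two loops
       for (t = 0; t < blockDim; ++t)            // logical threads, second region
           if (base + t < n) out[base + t] = buf[(t + 1) % blockDim];
    */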
Oct, 1

Optimizing a Semantic Comparator using CUDA-enabled Graphics Hardware

Emerging semantic search techniques require fast comparison of large "concept trees". This paper addresses the challenges involved in fast computation of similarity between two large concept trees using a CUDA-enabled GPGPU co-processor. We propose efficient techniques for this task, using fast hash computations, membership tests based on Bloom filters, and parallel reduction. We show how a […]
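The excerpt lists Bloom-filter membership tests among the building blocks; a minimal CUDA sketch of such a test (hash functions and names are illustrative, not the paper's) assigns one query key per thread:

    // Each thread tests one key against a Bloom filter stored as a bit array in
    // global memory; k double-hashed probes: bit = (h1 + p * h2) mod m_bits.
    __device__ unsigned int hash1(unsigned int x)
    {
        x ^= x >> 16; x *= 0x7feb352dU;
        x ^= x >> 15; x *= 0x846ca68bU;
        x ^= x >> 16; return x;                  // simple integer mixer
    }
    __device__ unsigned int hash2(unsigned int x)
    {
        return (x * 2654435761U) | 1U;           // forced odd so probes spread
    }

    __global__ void bloom_query(const unsigned int* keys, int n_keys,
                                const unsigned int* bits, unsigned int m_bits,
                                unsigned char* maybe_member, int k)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n_keys) return;
        unsigned int h1 = hash1(keys[i]);
        unsigned int h2 = hash2(keys[i]);
        unsigned char hit = 1;
        for (int p = 0; p < k && hit; ++p) {
            unsigned int bit = (h1 + p * h2) % m_bits;
            hit = (bits[bit / 32] >> (bit % 32)) & 1U;   // all k bits must be set
        }
        maybe_member[i] = hit;   // 1 = possibly in set, 0 = definitely not
    }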
Oct, 1

Customizable Domain-Specific Computing

To meet computing needs and overcome power density limitations, the computing industry has entered the era of parallelization. However, highly parallel, general-purpose computing systems face serious challenges in terms of performance, energy, heat dissipation, space, and cost. We believe that there is significant opportunity to look beyond parallelization and focus on domain-specific customization to bring […]
