Posts

Oct, 3

Towards robust automatic detection of vulnerable road users: monocular pedestrian tracking from a moving vehicle

In this paper we present steps towards the automatic detection of vulnerable road users in video. Such a system can e.g. be used as an automatic blind spot camera for trucks. The aim of the system is to automatically warn the driver when the algorithm detects vulnerable road users in the camera images. Such an […]
Oct, 3

An Auto-tuning Solution to Data Streams Clustering in OpenCL

Due to its applicability to numerous types of data, including telephone records, web documents, and click streams, the data stream model has recently attracted attention. For analysis of such data, it is crucial to process the data in a single pass, or a small number of passes, using little memory. This paper provides an OpenCL […]
Oct, 2

A New Class of Parallel Scheduling Algorithms

The main issue discussed in this book is concerned with solving job scheduling problems in parallel calculating environments, such as multiprocessor computers, clusters or distributed calculation nodes in networks, by applying algorithms which use various parallelization technologies starting from multiple calculation threads (multithread technique) up to distributed calculation processes. Strongly sequential character of the scheduling […]
Oct, 2

Hardware/Software Co-design for Energy-Efficient Seismic Modeling

Reverse Time Migration (RTM) has become the standard for high-quality imaging in the seismic industry. RTM relies on PDE solutions using stencils that are 8th order or larger, which require large-scale HPC clusters to meet the computational demands. However, the rising power consumption of conventional cluster technology has prompted investigation of architectural alternatives that offer […]
Oct, 2

Power Management and Optimization

After many years of focusing on "faster" computers, people have started taking notice of the fact that the race for "speed" has had the unfortunate side effect of increasing the total power consumed, thereby increasing the total cost of ownership of these machines. The heat produced has required expensive cooling facilities. As a result, it […]
Oct, 2

Development of a Chemically Reacting Flow Solver on the Graphic Processing Units

The focus of the current research is to develop a numerical framework on the Graphic Processing Units (GPU) capable of modeling chemically reacting flow. The framework incorporates a high-order finite volume method coupled with an implicit solver for the chemical kinetics. Both the fluid solver and the kinetics solver are designed to take advantage of […]
Oct, 2

An Execution Model and Runtime For Heterogeneous Many-Core Systems

The emergence of heterogeneous and many-core architectures presents a unique opportunity to deliver order of magnitude performance increases to high performance applications by matching certain classes of algorithms to specifically tailored architectures. However, their ubiquitous adoption has been limited by a lack of programming models and management frameworks designed to reduce the high degree of […]
Oct, 2

Power-Efficient Accelerators for High-Performance Applications

Computers, regardless of their function, are always better if they can operate more quickly. The addition of computation resources allows for improved response times, greater functionality and more flexibility. The drawback with improving a computer’s performance, however, is that it often comes at the cost of power and energy consumption. For many platforms, this is […]
Oct, 2

PTask: Operating System Abstractions To Manage GPUs as Compute Devices

We propose a new set of OS abstractions to support GPUs and other accelerator devices as first class computing resources. These new abstractions, collectively called the PTask API, support a data flow programming model. Because a PTask graph consists of OS-managed objects, the kernel has sufficient visibility and control to provide system-wide guarantees like fairness […]
Oct, 2

A Complete Description of the UnPython and Jit4GPU Framework

A new compilation framework enables the execution of numerical-intensive applications in an execution environment that is formed by multi-core Central Processing Units (CPUs) and Graphics Processing Units (GPUs). A critical innovation is the use of a variation of Linear Memory Access Descriptors (LMADs) to analyze loop nests and determine automatically which memory locations must be […]
Oct, 2

Parallel probabilistic model checking on general purpose graphics processors

We present algorithms for parallel probabilistic model checking on general purpose graphic processing units (GPGPUs). Our improvements target the numerical components of the traditional sequential algorithms. In particular, we capitalize on the fact that in most of them operations like matrix-vector multiplication and solving systems of linear equations are the main complexity bottlenecks. Since linear […]
Oct, 2

CHPS: An Environment for Collaborative Execution on Heterogeneous Desktop Systems

Modern commodity desktop computers equipped with multi-core Central Processing Units (CPUs) and specialized but programmable co-processors are capable of providing a remarkable computational performance. However, approaching this performance is not a trivial task as it requires the coordination of architecturally different devices for cooperative execution. Coordinating the use of the full set of processing units […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors