Posts

Sep, 9

CUDACS: securing the cloud with CUDA-enabled secure virtualization

While on the one hand unresolved security issues pose a barrier to the widespread adoption of cloud computing technologies, on the other hand the computing capabilities of even commodity hardware are growing rapidly, in particular thanks to the adoption of *-core technologies. For instance, the Nvidia Compute Unified Device Architecture (CUDA) technology is increasingly available on […]
Sep, 9

KAdvice: infering synchronization patterns from an existing codebase

Operating system kernels are complex software systems. The kernels of today's mainstream OSs, such as Linux or Windows, are composed of a number of modules, which contain code and data. Even when providing synchronous interfaces (APIs) to the programmer, large portions of the OS kernel operate in an asynchronous manner. Synchronizing access to kernel data […]
Sep, 9

Attaining system performance points: revisiting the end-to-end argument in system design for heterogeneous many-core systems

Trends indicate a rapid increase in the number of cores on chip, exhibiting various types of performance and functional asymmetries present in hardware to gain scalability with balanced power vs. performance requirements. This poses new challenges in platform resource management, which are further exacerbated by the need for runtime power budgeting and by the increased […]
Sep, 9

The architecture of the DecentVM: towards a decentralized virtual machine for many-core computing

Fully decentralized systems avoid bottlenecks and single points of failure. Thus, they can provide excellent scalability and very robust operation. The DecentVM is a fully decentralized, distributed virtual machine. Its simplified instruction set allows for a small VM code footprint. Its partitioned global address space (PGAS) memory model helps to easily create a single system […]
Sep, 9

Piccolo: building fast, distributed programs with partitioned tables

Piccolo is a new data-centric programming model for writing parallel in-memory applications in data centers. Unlike existing data-flow models, Piccolo allows computation running on different machines to share distributed, mutable state via a key-value table interface. Piccolo enables efficient application implementations. In particular, applications can specify locality policies to exploit the locality of shared state […]
Sep, 8

Parallel Programming on a Soft-Core Based Multi-core System

A soft-core system allows designers to conveniently modify the components of the architecture they have designed. In some systems, a uni-core processor cannot provide enough computing power for the heavy computation that specific applications require. In order to improve the performance of a multi-core system, in addition to the hardware architecture design, parallel […]
Sep, 8

Fast Ultrasound Image Simulation Using the Westervelt Equation

The simulation of ultrasound wave propagation is of high interest in fields such as ultrasound system development and therapeutic ultrasound. From a computational point of view, the requirements for realistic simulations are immense, with processing time reaching even an entire day. In this work we present a framework for fast ultrasound image simulation covering the imaging […]
Sep, 8

Experiences with Mapping Non-linear Memory Access Patterns into GPUs

Modern Graphics Processing Units (GPU) are very powerful computational systems on a chip. For this reason there is a growing interest in using these units as general purpose hardware accelerators (GPGPU). To facilitate the programming of general purpose applications, NVIDIA introduced the CUDA programming environment. CUDA provides a simplified abstraction of the underlying complex GPU […]
Sep, 8

A Fast GPU Implementation for Solving Sparse Ill-Posed Linear Equation Systems

Image reconstruction, a very compute-intense process in general, can often be reduced to large linear equation systems represented as sparse under-determined matrices. Solvers for these equation systems (not restricted to image reconstruction) spend most of their time in sparse matrix-vector multiplications (SpMV). In this paper we will present a GPU-accelerated scheme for a Conjugate Gradient […]
Sep, 8

Programming Many-Core Chips

This book presents new concepts, techniques and promising programming models for designing software for chips with "many" (hundreds to thousands) processor cores. Given the scale of parallelism inherent to these chips, software designers face new challenges in terms of operating systems, middleware and applications. This will serve as an invaluable, single-source reference to the state-of-the-art […]
Sep, 8

GPU Computation in Bioinspired Algorithms: A Review

Bioinspired methods usually need a large amount of computational resources. For this reason, parallelization is an interesting alternative for decreasing the execution time while still providing accurate results. Recently there has been growing interest in developing parallel algorithms using graphics processing units (GPUs), also referred to as GPU computation. Advances […]
Sep, 8

Towards GPGPU Assisted Computing in Virtualized Environments

General Purpose Computation on Graphics Processing Units (GPGPU) makes it possible to use the massive computing power of modern graphics cards for generic high-performance computing. However, the new virtualization technologies will typically not support high-performance graphics cards, and as a consequence GPGPU resources cannot be used in typical virtualization setups. In this paper we […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org