Posts

Sep, 5

Fast Construction of SAH BVHs on the Intel Many Integrated Core (MIC) Architecture

We investigate how to efficiently build bounding volume hierarchies (BVHs) with the surface area heuristic (SAH) on the Intel Many Integrated Core (MIC) Architecture. To achieve maximum performance, we use four key concepts: progressive 10-bit quantization to reduce cache footprint with negligible loss in BVH quality; an AoSoA data layout that allows efficient streaming and SIMD […]
Sep, 5

A CUDA-based parallel implementation of K-nearest neighbor algorithm

Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high performance computing for general-purpose applications. Due to the GPU’s tremendous computing capability, it has emerged as the co-processor of the CPU to achieve a high overall throughput. The CUDA programming model provides programmers with adequate C-like APIs to better exploit the parallel power of […]
Sep, 5

Real-Time Tone Mapping for High-Resolution HDR Images

High dynamic range rendering attempts to take an HDR image and produce a more realistic representation on a limited range computer monitor. Although several tone mapping operators have been proposed in recent years, no evaluation has yet been undertaken to explore which operator is more suitable for hardware implementation. In this paper, we begin with […]
Sep, 5

DUODECIM – a structure for point scan compression and rendering

In this paper we present a compression scheme for large point scans including per-point normals. For the encoding of such scans we introduce a particular type of closest sphere packing grids, the hexagonal close packing (HCP). HCP grids provide a structure for an optimal packing of 3D space, and for a given sampling error they […]
Sep, 5

Fault table generation using Graphics Processing Units

In this paper, we explore the implementation of fault table generation on a Graphics Processing Unit (GPU). A fault table is essential for fault diagnosis and fault detection in VLSI testing and debug. Generating a fault table requires extensive fault simulation, with no fault dropping, and is extremely expensive from a computational standpoint. Fault simulation […]
Sep, 5

Efficient Execution on GPUs of Field-Based Vehicular Mobility Models

Large-scale scenarios of vehicular traffic simulation problems are characterized by complex queuing effects, control mechanisms and other interactions of the traffic on the control and vice versa. While small-sized scenarios are relatively easy to explore and analyze, larger scenarios need specialized treatment for efficient execution. The simulation challenges of speed and scale become pronounced when […]
Sep, 5

Isocube: Exploiting the Cubemap Hardware

This paper proposes a novel six-face spherical map, isocube, that fully utilizes the cubemap hardware built in most GPUs. Unlike the cubemap, the proposed isocube uniformly samples the unit sphere (uniformly distributed), and all samples span the same solid angle (equally important). Its mapping computation contains only a small overhead. By feeding the cubemap hardware […]
Sep, 5

A CUDA Based Implementation of an Image Authentication Algorithm

Image authentication is an important technology for protecting images from malicious tampering and has become an indispensable part of the digital world. Over the last decade, the main schemes used for image authentication have been signature and watermarking. However, in traditional serial implementations, the operations of both methods are time-consuming, and limit the wide use of […]
Sep, 5

A near real-time framework for extracting tip-sample forces in dynamic atomic force microscopy (dAFM)

The atomic force microscope (AFM) is a versatile, high-resolution tool used to characterize the topography and material properties of a large variety of specimens at nano-scale. The interaction of the micro-cantilever tip with the specimen causes cantilever deflections that are measured by an optical sensing mechanism and subsequently utilized to construct the sample topography. Recent […]
Sep, 4

Electrical distribution grid visualization using programmable GPUs

Modern graphics cards enable applications to process large amounts of graphical data faster than CPUs, allowing high-volume parallelizable data to be visualized in real-time. In this paper, we present an approach to enable a power grid planning Computer-Aided-Design application to use this processing power to visualize electrical distribution grids in the fastest possible way. As […]
Sep, 4

Computing room acoustics with CUDA – 3D FDTD schemes with boundary losses and viscosity

In seeking to model realistic room acoustics, direct numerical simulation can be employed. This paper presents 3D Finite Difference Time Domain schemes that incorporate losses at boundaries and due to the viscosity of air. These models operate within a virtual room designed on a detailed floor plan. The schemes are computed at 44.1kHz, using large-scale […]
Sep, 4

Parallel Streaming Intra Prediction for Full HD H.264 Encoding

Intra prediction is the most important compute-intensive component of the H.264 intra frame coder. Its high computational cost puts heavy pressure on most current embedded programmable processors, especially in real-time HD H.264 video encoding. The stream processing model, an emerging parallel processing model supported by GPUs and most programmable processors, bridges the gap between flexible programmable […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors