Posts

Mar, 14

Accelerating DEM simulations on GPUs by reducing the impact of warp divergences

A method for accelerating DEM calculations on GPUs is developed. We examined how warp divergences arise in contact detection and force calculation, taking the GPU architecture into account. We then showed a strategy to reduce the impact of warp divergences on the runtime of the DEM force calculations.
Mar, 14

2nd International Conference on Multimedia and Communication Technologies (ICMCT2015), 2015

2015 2nd International Conference on Multimedia and Communication Technologies (ICMCT2015)
September 19-20, 2015, Hong Kong
Organized by American Society for Research (ASR)
http://www.icmct.org/
Submission Deadline: 2015-06-05
Topics: Hardware & Software for Multimedia Systems; Enabling Technologies for Multimedia; Multimedia Applications; Consumer Systems and Networks; Speech and Audio Processing; Image and Video Processing; Applied Signal Processing; Communication […]
Mar, 14

7th International Conference on Software Technology and Engineering (ICSTE 2015), 2015

2015 7th International Conference on Software Technology and Engineering (ICSTE 2015)
September 19-20, 2015, Hong Kong
Organized by American Society for Research (ASR)
http://www.icste.org/
Submission Deadline: 2015-06-05
Topics: AI and Knowledge based software engineering; Object-Oriented Technology; Artificial Intelligence; Parallel and Distributed Computing; Aspect-orientation and feature interaction; Patterns and frameworks; Business Process Reengineering & Science Process […]
Mar, 14

4th International Conference on Image, Vision and Computing (ICIVC 2015), 2015

2015 4th International Conference on Image, Vision and Computing (ICIVC 2015)
http://www.icivc.org/
Date: September 19-20, 2015
Venue: Hong Kong
Submission Deadline: 2015-06-05
Topics: Image acquisition; Detection and Estimation of Signal Parameters; Image processing; Signal Identification; Medical image processing; Nonlinear Signals and Systems; Pattern recognition and analysis; Time-Frequency Signal Analysis; Visualization; Signal Reconstruction; Image coding and […]
Mar, 12

GPGPU Performance and Power Estimation Using Machine Learning

Graphics Processing Units (GPUs) have numerous configuration and design options, including core frequency, number of parallel compute units (CUs), and available memory bandwidth. At many stages of the design process, it is important to estimate how application performance and power are impacted by these options. This paper describes a GPU performance and power estimation model […]
Mar, 12

Implementing Machine Learning Algorithms on GPUs for Real-Time Traffic Sign Classification

This paper investigates traffic sign classification, which is an important problem to solve for autonomous driving. Linear discriminant analysis and convolutional neural networks achieved an accuracy of 98.25% and 98.75% respectively when classifying eight different types of traffic signs. The CNN was implemented on a GPU for real-time traffic sign classification: testing time for the […]
Mar, 12

CUDA accelerated large scale vehicular area network simulator

Both the size and the computational activity of Vehicular Area Networks (VANETs) are growing. Simulating VANETs requires simulating not only the network standards but also the mobility of nodes. Such a dynamic system involves computation of node distances, routing protocols, the application layer, data sending, data receiving, etc. The simulation model of VANET requires both hardware and […]
Mar, 12

RadixBoost: A Hardware Acceleration Structure for Scalable Radix Sort on Graphic Processors

In this paper, we propose RadixBoost, a hardware acceleration structure for scalable 32-bit integer radix sort on GPU. The whole structure is integrated into a GPU microarchitecture as a special functional unit and can be started by new instructions. Our design enables a significantly faster sorting procedure for general purpose GPU computing. The RadixBoost architecture […]
Mar, 12

FastTree: A Hardware KD-Tree Construction Acceleration Engine for Real-Time Ray Tracing

The ray tracing algorithm is well-known for its ability to generate photo-realistic rendering effects. Recent years have witnessed a renewed momentum in pushing it to real-time for better user experience. Today the construction of acceleration structures, e.g., kd-tree, has become the bottleneck of ray tracing. A dedicated hardware architecture, FastTree, was proposed for kd-tree construction […]
Mar, 8

HOCL: A Family of Embedded Languages

We address the increasingly varied capabilities of specialized computing platforms by introducing a growing family of functionally-limited mini-languages, implemented as embedded domain specific languages (EDSLs) in Haskell, that may be composed to harness the computational features offered by a variety of hardware platforms. This development is based on a novel modular representation of the EDSL […]
Mar, 8

Converting Data-Parallelism to Task-Parallelism by Rewrites: Purely Functional Programs Across Multiple GPUs

High-level domain-specific languages for array processing on the GPU are increasingly common, but they typically only run on a single GPU. As computational power is distributed across more devices, languages must target multiple devices simultaneously. To this end, we present a compositional translation that fissions data-parallel programs in the Accelerate language, allowing subsequent compiler and […]
Mar, 8

An Empirical Performance Evaluation of GPU-Enabled Graph-Processing Systems

Graph processing is increasingly used in knowledge economies and in science, in advanced marketing, social networking, bioinformatics, etc. A number of graph-processing systems, including the GPU-enabled Medusa and Totem, have been developed recently. Understanding their performance is key to system selection, tuning, and improvement. Previous performance evaluation studies have been conducted for CPU-based graph-processing systems, […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors