high performance computing on graphics processing units: hgpu.org

Posts

Jun, 5

GPU Accelerated Nature Inspired Methods for Modelling Large Scale Bi-Directional Pedestrian Movement

Pedestrian movement, although ubiquitous and well-studied, is still not well understood due to the complicating nature of the embedded social dynamics. Interest among researchers in simulating the nature of pedestrian movement and interactions has grown significantly, in part due to the increased computational and visualization capabilities afforded by high power computing. Different approaches have been […]

CUDA

Jun, 5

Mapping parallel programs to heterogeneous multi-core systems

Heterogeneous computer systems are ubiquitous in all areas of computing, from mobile to high-performance computing. They promise to deliver increased performance at lower energy cost than purely homogeneous, CPU-based systems. In recent years GPU-based heterogeneous systems have become increasingly popular. They combine a programmable GPU with a multi-core CPU. GPUs have become flexible enough to […]

OpenCL

Jun, 5

A Comparative Study of Game Tree Searching Methods

In this paper, a comprehensive survey of game tree searching methods that can be used to find the best move in two-player zero-sum computer games is presented. The purpose of this paper is to discuss, compare, and analyze various sequential and parallel game tree algorithms, including some enhancements to them. Furthermore, a number of […]

CUDA

Jun, 5

Adventures in the microlensing cloud: large datasets, eResearch tools, and GPUs

As astronomy enters the petascale data era, astronomers are faced with new challenges relating to storage, access and management of data. A shift from the traditional approach of combining data and analysis at the desktop to the use of remote services, pushing the computation to the data, is now underway. In the field of cosmological […]

CUDA

Jun, 4

3rd International Conference on Communication and Electronics Information, ICCEI 2015

Submission Deadline: 2014-08-25 Publication: All ICCEI 2015 papers will be published in the International Journal of Information and Electronics Engineering (ISSN: 2010-3719), and all papers will be indexed by Google Scholar, EBSCO, Electronic Journals Library, Engineering & Technology Digital Library, Crossref and ProQuest, DOAJ, Ei (INSPEC, IET). Call for Paper: Electronics and Communications Engineering Telecommunication […]

Jun, 4

7th International Conference on Computer and Automation Engineering, ICCAE 2015

Submission Deadline: 2014-08-20 Publication: All papers for the ICCAE 2015 will be published in Journal of Automation and Control Engineering (JOACE) (ISSN: 2301-3702) as one volume, and will be indexed by EI (INSPEC, IET), Ulrich’s Periodicals Directory, Google Scholar, EBSCO, Engineering & Technology Digital Library and Electronic Journals Digital Library. Topic: T1: Modern and Advanced […]

Jun, 3

Utilizing Graphics Processing Units for Rapid Facial Recognition Using Video Input

Facial recognition is an active research area that provides a real-time application of pattern recognition techniques. Input can be provided to recognition algorithms using both static images and video data. However, there are significant challenges to working with live streaming data as the recognition method needs to keep up with the frame rate of the […]

CUDA

Jun, 3

Design Tools for Accelerating Development and Usage of Multi-Core Computing Platforms

Multicore computing technologies are critical to high performance embedded systems. Such technologies are advancing rapidly in terms of the diversity of available multicore platforms, and the scale and heterogeneity of computing resources available on multicore-equipped devices. However, development of high performance signal processing software for multicore computing platforms is a complex process. Due to this […]

CUDA

Jun, 3

Revisiting Edge and Node Parallelism for Dynamic GPU Graph Analytics

Betweenness Centrality is a widely used graph analytic that has applications such as finding influential people in social networks, analyzing power grids, and studying protein interactions. However, its complexity makes its exact computation infeasible for large graphs of interest. Furthermore, networks tend to change over time, invalidating previously calculated results and encouraging new analyses regarding […]

CUDA

Jun, 3

Visualization Tool for GPGPU Programming

The running times of some sequential programs could be greatly reduced by converting their parallelizable, time-dominant code and running it on a massively parallel processor architecture. Example application areas include bioinformatics, molecular dynamics, video and image processing, signal and audio processing, medical imaging, and cryptography. A low-cost, low-power parallel computing platform for […]

CUDA, OpenCL

Jun, 3

Cofactorization on Graphics Processing Units

We show how the cofactorization step, a compute-intensive part of the relation collection phase of the number field sieve (NFS), can be farmed out to a graphics processing unit. Our implementation on a GTX 580 GPU, which is integrated with a state-of-the-art NFS implementation, can serve as a cryptanalytic co-processor for several Intel i7-3770K quad-core […]

CUDA

Jun, 2

Loo.py: transformation-based code generation for GPUs and CPUs

Today’s highly heterogeneous computing landscape places a burden on programmers wanting to achieve high performance on a reasonably broad cross-section of machines. To do so, computations need to be expressed in many different but mathematically equivalent ways, with, in the worst case, one variant per target machine. Loo.py, a programming system embedded in Python, meets […]

OpenCL

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler
