high performance computing on graphics processing units: hgpu.org

Posts

Feb, 4

A Performance Analysis Framework for Optimizing OpenCL Applications on FPGAs

Recently, FPGA vendors such as Altera and Xilinx have released OpenCL SDKs for programming FPGAs. However, the architecture of an FPGA differs significantly from that of a CPU or GPU, for which OpenCL was originally designed. Tuning OpenCL code for good performance on FPGAs is still an open problem, since the existing OpenCL tools and models designed […]

OpenCL

Feb, 4

Workshop on Exascale Multi/Many Core Computing Systems, 2016

Context: Exascale computing will revolutionize computational science and engineering by providing 1000x the capabilities of currently available computing systems, while having a similar power footprint. The HPC community is working towards the development of the first Exaflop computer after reaching the Petaflop milestone in 2008. There are concerns that computer designs based on existing […]

Feb, 3

Optimization and Large Scale Computation of an Entropy-Based Moment Closure

We present computational advances and results in the implementation of an entropy-based moment closure, M_N, in the context of linear kinetic equations, with an emphasis on heterogeneous and large-scale computing platforms. Entropy-based closures are known in several cases to yield more accurate results than closures based on standard spectral approximations, such as P_N, but the […]

CUDA

Feb, 3

Deep Learning For Smile Recognition

Inspired by recent successes of deep learning in computer vision, we propose a novel application of deep convolutional neural networks to facial expression recognition, in particular smile recognition. A smile recognition test accuracy of 99.45% is achieved for the Denver Intensity of Spontaneous Facial Action (DISFA) database, significantly outperforming existing approaches based on hand-crafted features […]

CUDA

Feb, 3

Edge coloring in unstructured CFD codes

We propose a way of preventing race conditions in the evaluation of the surface integral contribution in discontinuous Galerkin and finite volume flow solvers by coloring the edges (or faces) of the computational mesh. In this work we use a partitioning algorithm that separates the edges of triangular elements into three groups and the faces […]
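
As a rough illustration of the idea in this abstract, the sketch below colors the edges of a small mesh so that no two edges of the same color touch the same cell; edges of one color could then be processed concurrently without conflicting writes. It uses a simple greedy pass, not the partitioning algorithm the paper describes, and the names `color_edges` and `is_race_free` as well as the (left cell, right cell) edge representation are assumptions made for this example.

```python
def color_edges(edges):
    """Greedily assign a color to each edge.

    edges: list of (left_cell, right_cell) pairs; right_cell is None
    for a boundary edge. Two edges that share a cell never receive
    the same color. (Hypothetical helper, not the paper's algorithm.)
    """
    used = {}      # cell id -> set of colors already used on its edges
    colors = []
    for left, right in edges:
        cells = [c for c in (left, right) if c is not None]
        taken = set()
        for c in cells:
            taken |= used.get(c, set())
        color = 0
        while color in taken:   # smallest color free on both cells
            color += 1
        colors.append(color)
        for c in cells:
            used.setdefault(c, set()).add(color)
    return colors


def is_race_free(edges, colors):
    """Check that no cell is written twice within one color group."""
    seen = set()
    for (left, right), color in zip(edges, colors):
        for cell in (left, right):
            if cell is None:
                continue
            if (color, cell) in seen:
                return False
            seen.add((color, cell))
    return True


# Two triangles (cells 0 and 1) sharing one interior edge.
edges = [(0, 1), (0, None), (0, None), (1, None), (1, None)]
colors = color_edges(edges)
```

On this two-triangle mesh the greedy pass happens to use three colors, but unlike a dedicated partitioning scheme it gives no such guarantee on general triangular meshes.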

Feb, 3

A novel approach to evaluating compact finite differences and similar tridiagonal schemes on GPU-accelerated clusters

Compact finite difference schemes are widely used in the direct numerical simulation of fluid flows for their ability to better resolve the small scales of turbulence. However, they can be expensive to evaluate and difficult to parallelize. In this work, we present an approach for the computation of compact finite differences and similar tridiagonal schemes […]

CUDA
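
To make the parallelization difficulty mentioned in this abstract concrete, here is a minimal serial sketch of the Thomas algorithm, the standard direct solver for tridiagonal systems such as those arising from compact finite differences. It is a textbook baseline, not the approach proposed in the paper; the function name `thomas_solve` and the array layout are assumptions made for this example.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is done, so the
    matrix should be diagonally dominant (assumption of this sketch).
    """
    n = len(diag)
    c = [0.0] * n   # modified upper diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: each step depends on the previous one,
    # which is what makes the method inherently sequential.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution, again a sequential dependency chain.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Small diagonally dominant example whose solution is [1, 2, 3, 4].
x = thomas_solve([0, 1, 1, 1], [4, 4, 4, 4], [1, 1, 1, 0], [6, 12, 18, 19])
```

Both loops carry a dependency from one row to the next, so a single system cannot simply be split across threads without changing the algorithm.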

Feb, 3

Algorithms and Heuristics for Scalable Betweenness Centrality Computation on Multi-GPU Systems

Betweenness Centrality (BC) is steadily growing in popularity as a metric of the influence of a vertex in a graph. The BC score of a vertex is proportional to the number of all-pairs shortest paths passing through it. However, complete and exact BC computation for a large-scale graph is an extraordinary challenge that requires high performance computing […]

CUDA
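
The BC definition in this abstract can be sketched with Brandes' algorithm for unweighted graphs, which runs one breadth-first search per source vertex and then accumulates path dependencies in reverse order. This is a serial, single-machine sketch of the standard algorithm, not the multi-GPU heuristics of the paper; the function name `betweenness` and the adjacency-dict input are assumptions made for this example.

```python
from collections import deque


def betweenness(adj):
    """Exact betweenness centrality for an unweighted graph.

    adj: dict mapping each vertex to a list of its neighbours.
    Scores count ordered (source, target) pairs, so halve them
    for an undirected graph.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}    # number of shortest s-v paths
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}   # predecessors on shortest paths
        order = []                     # vertices in BFS visiting order
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest vertices inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Path graph a - b - c: only b lies between other vertices.
scores = betweenness({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
```

Each source vertex is processed independently of the others, which is the usual starting point for distributing the work across several devices.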

Feb, 2

International Conference on Biomedical Signal and Image Processing (ICBIP), 2016

ICBIP 2016 highlights: 1. Accepted and published papers can be indexed by Ei Compendex, Inspec, Scopus and other databases. 2. Prof. Dr. Bártfai Gyorgy from the University of Szeged, Department of Obstetrics and Gynaecology, Hungary and Prof. Ioana Demetrescu from University Politehnica Bucharest, Romania have joined as Keynote Speakers. 3. One-day visit and tour […]

Feb, 2

6th International Conference on Circuits, System and Simulation (ICCSS), 2016

The 2016 6th International Conference on Circuits, System and Simulation (ICCSS 2016) will be held in Mexico City during August 16-18, 2016. It is technically sponsored by the National Autonomous University of Mexico (NAUM), Mexico. Paper publication: all accepted papers must be written in English and will be published in IEEE conference proceedings, indexed by Ei Compendex. Conference […]

Feb, 2

2nd International Conf. on Robotics and Artificial Intelligence (ICRAI), 2016

Committees. Honorary Chairs: Dr. Francisco E. Rivera, Federal Aviation Administration (FAA), USA. International Advisory Committees: Prof. Rory McGreal, Athabasca University, Canada. Conference Co-Chairs: Dr. Houssain Kettani, Fort Hays State University, USA; Prof. Feng C. Lai, University of Oklahoma, USA. Keynote Speakers: Prof. Rory McGreal, UNESCO/COL Chair in OER, Athabasca University, Canada; Prof. Dr. Houssain […]

Feb, 2

2nd International Conf. on Bioinformatics and Computer Engineering (ICBCE), 2016

Feb, 2

Fast, Realistic Terrain Synthesis

The authoring of realistic terrain models is necessary to generate immersive virtual environments for computer games and film visual effects. However, creating these landscapes is difficult – it usually involves an artist spending many hours sculpting a model in a 3D design program. Specialised terrain generation programs exist to rapidly create artificial terrains, such as […]

CUDA

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

Most viewed papers (last 30 days)