high performance computing on graphics processing units: hgpu.org

Posts

Mar, 22

MACC: An OpenACC Transpiler for Automatic Multi-GPU Use

Graphics Processing Units (GPUs) perform the majority of computations in state-of-the-art supercomputers. Programming these GPUs is often assisted by a programming model such as the directive-driven OpenACC, among others. Unfortunately, OpenACC (like other similar models) is incapable of automatically targeting and distributing work across several GPUs, which decreases productivity and forces needless manual labor upon […]
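The manual labor mentioned in the excerpt largely comes down to partitioning a loop's iteration space across devices by hand. The following is a minimal sketch of that bookkeeping, assuming a simple contiguous block distribution. The helper `split_iterations` is hypothetical and is not taken from MACC, whose actual strategy is not described in the excerpt.

```python
def split_iterations(n_iters, n_devices):
    """Split range(n_iters) into contiguous half-open (start, stop) blocks.

    Hypothetical helper for illustration only: one block per device,
    with block sizes differing by at most one iteration.
    """
    if n_iters < 0 or n_devices <= 0:
        raise ValueError("need n_iters >= 0 and n_devices > 0")
    base, extra = divmod(n_iters, n_devices)
    ranges = []
    start = 0
    for dev in range(n_devices):
        # The first `extra` devices take one additional iteration.
        size = base + (1 if dev < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for dev, (lo, hi) in enumerate(split_iterations(10, 4)):
        print(f"device {dev}: iterations [{lo}, {hi})")
```

Each device would then run the same loop body over its own block, which is the kind of step a transpiler could plausibly generate instead of the programmer.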

Mar, 18

International Conference on Biomedicine & Pharmacotherapy, 2018

The International Conference on Biomedicine & Pharmacotherapy will be held August 6-7, 2018 in Osaka, Japan. The conference focuses on topics such as Biomedicine, Biomedical Statistics, Biomedical Diagnosis, Frontiers in Biomedicine, Industrial Pharmacy, Pharmacotherapy, Molecular Biomedicine, Computational Biomedicine, Tissue Engineering, Medical Devices, Biomedical Model, Personalized Medicine, Biomedical Technology, Nanotechnology, Pharmaceutical Sciences, […]

Mar, 18

8th International Workshop on Computer Science and Engineering (WCSE’18), 2018

Meeting time: June 28-30, 2018. Meeting place: 1880 New Petchburi Road, Bangkok 10310, Thailand. Organized by the Science and Engineering Institute and co-organized by Bauman Moscow State Technical University, Russia; Tokyo University of Science, Japan; and China Agricultural University, the 8th International Workshop on Computer Science and Engineering (WCSE 2018) will be held in Bangkok, Thailand during June 28-30, 2018. […]

Mar, 18

International Conference on Robotics, Artificial Intelligence, Automation and Mechatronics, 2018

Mechatronics and Robotics 2018 warmly welcomes all researchers, developers, experts, and students from the field of mechatronics & robotics to attend the International Conference on Mechatronics & Robotics, October 15-16, 2018, in Helsinki, Finland. The conference is organized around the theme “Unfolding Knowledge with a Delineate Technical World”. The sessions covered are listed below: 1. Mechatronics […]

Mar, 18

The 3rd IEEE International Conference on Image, Vision and Computing (ICIVC), 2018

Meeting time: June 27-29, 2018. Meeting place: No.2 Chong Wen Road, Nan An District, Chongqing 400065, P. R. China. Keynote speakers: Prof. Lap-Pui Chau, Nanyang Technological University, Singapore (IEEE Fellow); Prof. Shahram Latifi, University of Nevada, USA (IEEE Fellow). Published by: All accepted papers must be written in English and will be published in the conference proceedings and indexed […]

Mar, 18

The 2nd International Conference on Robotics and Automation Sciences (ICRAS), 2018

Meeting time: June 23-25, 2018. Meeting place: Room 301, 2nd Floor, No.2 Teaching Building, West Area of the campus, No. 388 Lumo Road, Wuhan, P.R. China. Keynote speakers: Prof. Mengchu Zhou (IEEE Fellow, IFAC Fellow, AAAS Fellow), New Jersey Institute of Technology, USA; Prof. Zhang Dan, PhD, PEng., FCAE, FEIC, FASME, FCSME, SMIEEE, Kaneff Research […]

Mar, 17

HPVM: Heterogeneous Parallel Virtual Machine

We propose a parallel program representation for heterogeneous systems, designed to enable performance portability across a wide range of popular parallel hardware, including GPUs, vector instruction sets, multicore CPUs and potentially FPGAs. Our representation, which we call HPVM, is a hierarchical dataflow graph with shared memory and vector instructions. HPVM supports three important capabilities for […]

OpenCL
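To make the phrase "hierarchical dataflow graph" from the HPVM excerpt more concrete, here is a small data-structure sketch. It assumes only what the excerpt states: nodes connected by dataflow edges, where a node may itself contain a nested graph. The class `Node`, its fields, and the example node names are hypothetical and do not reflect HPVM's actual representation or API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hierarchical dataflow graph (illustrative only).

    A node with no children is a leaf that would hold actual computation;
    a node with children is an internal node wrapping a nested graph.
    """
    name: str
    children: list = field(default_factory=list)  # nested Node objects
    edges: list = field(default_factory=list)     # (producer, consumer) names among children

    def is_leaf(self):
        return not self.children

    def leaves(self):
        """Return the names of all leaf nodes, depth first."""
        if self.is_leaf():
            return [self.name]
        names = []
        for child in self.children:
            names.extend(child.leaves())
        return names


if __name__ == "__main__":
    # Hypothetical two-level graph: a pipeline whose middle stage is itself a graph.
    blur = Node("blur", children=[Node("blur_rows"), Node("blur_cols")],
                edges=[("blur_rows", "blur_cols")])
    root = Node("pipeline", children=[Node("load"), blur, Node("store")],
                edges=[("load", "blur"), ("blur", "store")])
    print(root.leaves())
```

The nesting is what would let different levels of the hierarchy be mapped to different levels of hardware parallelism, though how HPVM does this is beyond what the excerpt shows.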

Mar, 17

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be […]

OpenCL

Mar, 17

CuLDA_CGS: Solving Large-scale LDA Problems on GPUs

Latent Dirichlet Allocation (LDA) is a popular topic model. Because the input corpus of LDA algorithms consists of millions to billions of tokens, the LDA training process is very time-consuming, which may prevent the use of LDA in many scenarios, e.g., online services. GPUs have benefited modern machine learning algorithms and big data […]

CUDA

Mar, 17

Improved OpenCL-based Implementation of Social Field Pedestrian Model

Two improvements are proposed for the OpenCL-based implementation of the social field pedestrian model. On the algorithmic side, a method based on divide-and-conquer is devised to overcome the problem of global memory depletion when fields are large. This is of importance for the study […]

OpenCL
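The divide-and-conquer idea in the excerpt above can be illustrated by splitting a large 2D field into tiles that each fit a fixed memory budget, so only one tile needs to reside in device memory at a time. This is a generic tiling sketch under that assumption, not the paper's method. The function `tile_field` and the `max_cells` budget are hypothetical.

```python
def tile_field(height, width, max_cells):
    """Yield (row0, row1, col0, col1) half-open tiles covering a height x width field.

    Illustrative only: every tile holds at most `max_cells` cells, standing in
    for a limit imposed by available global memory.
    """
    if height <= 0 or width <= 0 or max_cells <= 0:
        raise ValueError("height, width and max_cells must be positive")
    tile_w = min(width, max_cells)
    tile_h = min(height, max_cells // tile_w)  # at least 1, since tile_w <= max_cells
    for row0 in range(0, height, tile_h):
        for col0 in range(0, width, tile_w):
            yield (row0, min(row0 + tile_h, height),
                   col0, min(col0 + tile_w, width))


if __name__ == "__main__":
    for tile in tile_field(4, 6, 12):
        print(tile)
```

A real implementation would also have to handle interactions that cross tile boundaries, which this sketch leaves out.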

Mar, 17

NVIDIA Tensor Core Programmability, Performance & Precision

The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called "Tensor Core" that performs one matrix-multiply-and-accumulate on 4×4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to program […]

CUDA
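The Tensor Core operation described in the excerpt above, a matrix-multiply-and-accumulate on 4×4 matrices, can be emulated on the CPU to show what "mixed precision" means and where the 125 Tflop/s figure plausibly comes from. This sketch assumes half-precision inputs for A and B, a higher-precision accumulator, and a boost clock of about 1.53 GHz. The clock value is an assumption not stated in the excerpt, and the function names are hypothetical.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def mma_4x4(a, b, c):
    """Return D = A*B + C for 4x4 matrices given as lists of lists.

    A and B are rounded to half precision first. Accumulation uses Python
    floats (double precision), standing in for the hardware's wider accumulator.
    """
    a16 = [[to_half(v) for v in row] for row in a]
    b16 = [[to_half(v) for v in row] for row in b]
    return [[c[i][j] + sum(a16[i][k] * b16[k][j] for k in range(4))
             for j in range(4)]
            for i in range(4)]


def peak_tflops(n_cores, clock_hz):
    """Theoretical peak, assuming one 4x4x4 MMA per core per clock cycle."""
    fma_per_mma = 4 * 4 * 4                        # 64 fused multiply-adds
    flops = n_cores * fma_per_mma * 2 * clock_hz   # each FMA counts as 2 flops
    return flops / 1e12


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    zeros = [[0.0] * 4 for _ in range(4)]
    print(mma_4x4(identity, identity, zeros))
    # 1.53e9 Hz is an assumed boost clock, not a figure from the excerpt.
    print(peak_tflops(640, 1.53e9))
```

With 640 cores and the assumed clock, the estimate comes to roughly 125 Tflop/s, consistent with the peak quoted in the abstract.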

Mar, 10

Portable Real-Time DCT Based Steganography Using OpenCL

In this paper, a steganographic method for real-time data hiding is proposed. The main goal of the research is to develop a steganographic method with increased robustness to unintentional image processing attacks. In addition, we prove the validity of the method in real-time applications. The method is based on a discrete cosine transform (DCT) […]

OpenCL