Posts

Jun, 9

Fast Locality Sensitive Hashing for Beam Search on GPU

We present a GPU-based Locality Sensitive Hashing (LSH) algorithm to speed up beam search for sequence models. We utilize the winner-take-all (WTA) hash, which is based on the relative ranking order of hidden dimensions and is thus resilient to perturbations in numerical values. Our algorithm is designed by fully considering the underlying architecture of CUDA-enabled GPUs (Algorithm/Architecture […]
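
As a rough illustration of why a rank-based hash is insensitive to numerical perturbations, here is a minimal CPU sketch of the general winner-take-all hashing idea in plain Python. It is not the GPU algorithm from the paper; the function names (`make_permutations`, `wta_hash`), the window size and the toy vectors are assumptions made for this example only.

```python
import random


def make_permutations(dim, num_hashes, seed=0):
    """Draw `num_hashes` random permutations of the indices 0..dim-1."""
    rng = random.Random(seed)
    perms = []
    for _ in range(num_hashes):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms


def wta_hash(vector, perms, window):
    """Winner-take-all hash (illustrative sketch).

    For each permutation, look at the first `window` permuted entries and
    record the position of the largest one. Only the relative order of the
    values matters, not their magnitudes.
    """
    code = []
    for p in perms:
        head = [vector[i] for i in p[:window]]
        code.append(max(range(window), key=head.__getitem__))
    return tuple(code)


# Hypothetical toy data: y is a rescaled and shifted copy of x, so the
# rank order of its entries is the same as in x.
perms = make_permutations(dim=8, num_hashes=4, seed=0)
x = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]
y = [3.0 * v + 1.0 for v in x]

code_x = wta_hash(x, perms, window=4)
code_y = wta_hash(y, perms, window=4)
print(code_x == code_y)
```

Because the code depends only on which entry wins inside each window, any transformation that preserves the ordering of the values leaves the hash unchanged.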
Jun, 9

Deep Fluids: A Generative Network for Parameterized Fluid Simulations

This paper presents a novel generative model to synthesize fluid simulations from a set of reduced parameters. A convolutional neural network is trained on a collection of discrete, parameterizable fluid simulation velocity fields. Due to the capability of deep learning architectures to learn representative features of the data, our generative model is able to accurately […]
Jun, 5

The Third International Workshop on GPU Computing and AI (GCA), 2018

The Third International Workshop on GPU Computing and AI (GCA), http://is-candar.org/GCA18/, to be held in conjunction with The Sixth International Symposium on Computing and Networking (CANDAR’18), Hida Takayama, Japan, November 27-30, 2018, http://is-candar.org/. [Introduction] Built for massive parallelism, General Purpose computing on Graphic Processing Unit (GPGPU) has superseded high-performance CPU in several important […]
Jun, 5

The 5th International Conference on Power and Energy Systems Engineering (CPESE), 2018

Meeting time: September 19-21, 2018. Meeting place: Nagoya University, Japan. Keynote speakers: Prof. Tony C.Y. Chung – Fellow of IEEE, University of Saskatchewan, Canada; Prof. Hassan Bevrani – University of Kurdistan, Iran. Published by: All accepted papers, after proper registration and presentation, will be published in the CPESE 2018 conference proceedings. Important dates: Paper Submission: […]
Jun, 5

The 10th International Conference on Information Management and Engineering (ICIME), 2018

Meeting time: September 22-24, 2018. Meeting place: MediaCityUK, Salford Quays, Greater Manchester, England. Keynote speakers: Prof. Sunil Vadera – University of Salford, UK; Prof. Marat Akhmet – Middle East Technical University, Turkey. Published by: All registered and presented papers will be published in the International Conference Proceedings Series by ACM, which will be archived in […]
Jun, 5

The 4th International Conference on Control Science and Systems Engineering (ICCSSE), 2018

Meeting time: August 21-23, 2018. Meeting place: Huazhong University of Science and Technology of China, No. 1037, Luoyu Road, Hongshan District, Wuhan, China. Published by: Selected and registered papers will be published by IEEE Conference Publication. After a careful reviewing process, all accepted papers, after proper registration and presentation, will be published in the conference […]
Jun, 5

The 2018 International Conference on Cloud Computing and Internet of Things (CCIOT’18), 2018

Meeting time: October 29-31, 2018. Meeting place: Nanyang Executive Centre in Nanyang Technological University, Singapore. Host unit: ACM Singapore Chapter. Keynote speakers: Prof. Latif Ladid, University of Luxembourg, Luxembourg; Prof. Dimitrios Georgakopoulos, Swinburne University of Technology, Australia. Published by: Accepted papers will be published in the conference proceedings, which are indexed by EI Compendex, Scopus, Thomson […]
Jun, 2

clMF: A fine-grained and portable alternating least squares algorithm for parallel matrix factorization

Alternating least squares (ALS) has been proven to be an effective solver for matrix factorization in recommender systems. To speed up factorization, various parallel ALS solvers have been proposed to leverage modern multi-core and many-core processors. Existing implementations are limited in either speed or portability. In this paper, we present an efficient and portable ALS […]
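
For readers unfamiliar with ALS, the following is a minimal serial sketch of the alternating update for a rank-1 factorization in plain Python. It is not the clMF solver described in the paper; the function names (`objective`, `als_rank1`), the regularization value and the toy ratings matrix are assumptions for illustration, with `None` standing for an unobserved rating.

```python
def objective(ratings, u, v, reg):
    """Regularized squared error over the observed entries."""
    err = 0.0
    for i, row in enumerate(ratings):
        for j, r in enumerate(row):
            if r is not None:
                err += (r - u[i] * v[j]) ** 2
    return err + reg * (sum(a * a for a in u) + sum(b * b for b in v))


def als_rank1(ratings, num_iters=20, reg=0.1):
    """Rank-1 ALS sketch: alternate closed-form ridge updates for u and v."""
    n_rows, n_cols = len(ratings), len(ratings[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    history = [objective(ratings, u, v, reg)]
    for _ in range(num_iters):
        # Fix v and solve a one-dimensional ridge regression per row.
        for i in range(n_rows):
            obs = [(j, r) for j, r in enumerate(ratings[i]) if r is not None]
            num = sum(r * v[j] for j, r in obs)
            den = reg + sum(v[j] ** 2 for j, _ in obs)
            u[i] = num / den
        # Fix u and solve the same kind of problem per column.
        for j in range(n_cols):
            obs = [(i, ratings[i][j]) for i in range(n_rows)
                   if ratings[i][j] is not None]
            num = sum(r * u[i] for i, r in obs)
            den = reg + sum(u[i] ** 2 for i, _ in obs)
            v[j] = num / den
        history.append(objective(ratings, u, v, reg))
    return u, v, history


# Hypothetical toy ratings matrix (users x items) with missing entries.
ratings = [
    [5.0, 3.0, None, 1.0],
    [4.0, None, 2.0, 1.0],
    [None, 1.0, 1.0, 5.0],
]
u, v, history = als_rank1(ratings)
print(history[0], history[-1])
```

Each half-step minimizes the regularized objective exactly with the other factor held fixed, so the objective never increases. The per-row and per-column updates are independent of one another, which is the property that parallel ALS solvers exploit.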
Jun, 2

Design of FPGA-Based Accelerator for Convolutional Neural Network under Heterogeneous Computing Framework with OpenCL

CPUs have insufficient resources for efficient computation of Convolutional Neural Networks (CNNs), especially in embedded applications. Therefore, heterogeneous computing platforms such as GPU, FPGA and ASIC are widely used to accelerate CNN tasks. Among these, FPGA can accelerate the computation by mapping the algorithm to the parallel hardware instead of CPU, which […]
Jun, 2

NengoDL: Combining deep learning and neuromorphic modelling methods

NengoDL is a software framework designed to combine the strengths of neuromorphic modelling and deep learning. NengoDL allows users to construct biologically detailed neural models, intermix those models with deep learning elements (such as convolutional networks), and then efficiently simulate those models in an easy-to-use, unified framework. In addition, NengoDL allows users to apply deep […]
Jun, 2

Marian: Cost-effective High-Quality Neural Machine Translation in C++

This paper describes the submissions of the "Marian" team to the WNMT 2018 shared task. We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning and other methods to optimize the Transformer model on GPU and CPU. By further integrating these methods with the new averaging attention networks, a recently introduced faster Transformer variant, we […]
Jun, 2

FPGA-based Acceleration of FT Convolution for Pulsar Search Using OpenCL

The Square Kilometre Array (SKA) project will be the world's largest radio telescope array. With its large number of antennas, the number of signals that need to be processed is dramatic. One important element of the SKA’s Central Signal Processor package is pulsar search. This paper focuses on the FPGA-based acceleration of the Frequency-Domain Acceleration […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org