17842

Posts

Nov, 30

Implementing implicit OpenMP data sharing on GPUs

OpenMP is a shared memory programming model which supports the offloading of target regions to accelerators such as NVIDIA GPUs. The implementation in Clang/LLVM aims to deliver a generic GPU compilation toolchain that supports both the native CUDA C/C++ and the OpenMP device offloading models. There are situations where the semantics of OpenMP and those […]
Nov, 30

2D Image Convolution using Three Parallel Programming Models on the Xeon Phi

Image convolution is widely used for sharpening, blurring and edge detection. In this paper, we review two common algorithms for convolving a 2D image by a separable kernel (filter). After optimising the naive codes using loop unrolling and SIMD vectorisation, we choose the algorithm with better performance as the baseline for parallelisation. We then compare […]
Nov, 30

fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs

In recent years, Convolutional Neural Networks (ConvNets) have become an enabling technology for a wide range of novel embedded Artificial Intelligence systems. Across the range of applications, the performance needs vary significantly, from high-throughput video surveillance to the very low-latency requirements of autonomous cars. In this context, FPGAs can provide a potential platform that can […]
Nov, 30

Intel FPGA SDK for OpenCL

The Intel FPGA SDK for OpenCL Programming Guide provides descriptions, recommendations and usage information on the Intel Software Development Kit (SDK) for OpenCL compiler and tools. The Intel FPGA SDK for OpenCL is an OpenCL-based heterogeneous parallel programming environment for Intel FPGA products.
Nov, 28

International Conference on Computing and Pattern Recognition (ICCPR), 2018

Computing and Pattern Recognition contribute greatly to the advancement of technologies. Rapid technological advancements can be achieved through continuous research. Hong Kong Chemical, Biological & Environmental Engineering Society (HKCBEES) and Biology and Bioinformatics Society (BBS) are going to organize the 2018 International Conference on Computing and Pattern Recognition (ICCPR 2018) during June 23-25, 2018 in […]
Nov, 28

7th International Conference on Bioinformatics and Biomedical Science (ICBBS), 2018

The 7th International Conference on Bioinformatics and Biomedical Science (ICBBS 2018) will be held at Harbin Institute of Technology, Shenzhen, China, during June 23-25, 2018. The ICBBS conference series is held annually. Previously, it has been successfully held in Bangkok, Kuala Lumpur, Copenhagen, Bali and Singapore. The aim of the ICBBS is to provide a platform for […]
Nov, 28

2nd International Conference on Medical and Health Informatics (ICMHI), 2018

We are pleased to announce that the second International Conference on Medical and Health Informatics 2018 (ICMHI 2018) will be held in Tsukuba, Japan during June 8-10, 2018. The ICMHI 2018 aims to bring together outstanding scholars, researchers, and students to exchange and share their experiences and research results about all aspects of Medical and […]
Nov, 28

International Conference on Virtual Reality Technology (ICVRT), 2018

The 2018 International Conference on Virtual Reality Technology (ICVRT 2018) will be held in Singapore during June 15-18, 2018. It is sponsored by the International Association of Computer Science and Information Technology. With an objective to serve as an annual platform for students, academics and industry researchers in the fields of Virtual Reality to exchange face to […]
Nov, 28

International Conference on Advances in Image Processing (ICAIP), 2018

The International Conference on Advances in Image Processing (ICAIP) is mainly organised by the International Academy of Computing Technology (IACT). The aim of ICAIP is to become an international forum where researchers, scientists, scholars and students can share their experiences, ideas and research results related to Advances in Image Processing. Publication: All accepted and presented papers […]
Nov, 26

Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs

Deep learning frameworks have been widely deployed on GPU servers for deep learning applications in both academia and industry. In the training of deep neural networks (DNNs), there are many standard processes or algorithms, such as convolution and stochastic gradient descent (SGD), but the running performance of different frameworks may differ even when running the […]
Nov, 26

High Performance Streaming Smith-Waterman Implementation with Implicit Synchronization on Intel FPGA using OpenCL

The Smith-Waterman algorithm is widely used in bioinformatics and is often used as a benchmark of FPGA performance. Here we present our highly optimized Smith-Waterman implementation on Intel FPGAs using OpenCL. Our implementation is both faster and more efficient than other current Smith-Waterman implementations, obtaining a theoretical performance of 214 GCUPS. Moreover, due to the […]
Nov, 26

Computing the distance between two finite element solutions defined on different 3D meshes on a GPU

This article introduces a new method to efficiently compute the distance (i.e., L^p norm of the difference) between two functions supported by two different meshes of the same 3D domain. The functions that we consider are typically finite element solutions discretized in different function spaces supported by meshes that are potentially completely unrelated. Our method […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hgpu.org