Posts

Dec, 16

Performance Analysis of a Stereo Matching Implementation in OpenCL

Stereo matching is one of the first steps in the process of calculating 3D information from two 2D images. To triangulate a 3D point from two corresponding 2D features, the displacement in pixels, or the so-called disparity, must be estimated. From the estimated per-pixel disparity, using a projective camera model, 3D data for large portions […]
Dec, 16

Developing acquisition systems based on FPGA with OpenCL

Nuclear fusion is a phenomenon in which hydrogen nuclei collide and fuse, forming helium. The resulting nucleus is heavier than each hydrogen nucleus, but lighter than the sum of the masses of the nuclei involved in the process. This phenomenon releases huge amounts of energy. The research group i2a2 develops the […]
Dec, 16

Delivering Performance-Portable Stencil Computations on CPUs and GPUs Using Bricks

Achieving high performance on stencil computations poses a number of challenges on modern architectures. The optimization strategy varies significantly across architectures, types of stencils, and types of applications. The standard approach to adapting stencil computations to different architectures, used by both compilers and application programmers, is through the use of iteration space tiling, whereby the […]
Dec, 16

SIMD-X: Programming and Processing of Graph Algorithms on GPUs

With high computation power and memory bandwidth, graphics processing units (GPUs) lend themselves to accelerating data-intensive analytics, especially when such applications fit the single instruction multiple data (SIMD) model. However, graph algorithms such as breadth-first search and k-core often fail to take full advantage of GPUs, due to irregularity in memory access and control flow. […]
Dec, 12

3rd International Workshop on Theoretical Approaches to Performance Evaluation, Modeling and Simulation (TAPEMS), 2019

The objective of the 3rd TAPEMS International Workshop on Theoretical Approaches to Performance Evaluation, Modeling and Simulation is to bring together researchers and practitioners from academia and industry to discuss current advances and trends in both theoretical and experimental approaches to the performance evaluation, modeling and analysis of parallel applications and algorithms on multicore […]
Dec, 9

Clacc: Translating OpenACC to OpenMP in Clang

OpenACC was launched in 2010 as a portable programming model for heterogeneous accelerators. Although various implementations already exist, no extensible, open-source, production-quality compiler support is available to the community. This deficiency poses a serious risk for HPC application developers targeting GPUs and other accelerators, and it limits experimentation and progress for the OpenACC specification. To […]
Dec, 9

High Performance Portable Tsunami Simulations on Many-core CPU, GPU, and FPGA

A tsunami generated by a submarine earthquake can cause serious damage in coastal areas. To reduce the negative effects of tsunamis, effective evacuation and disaster prevention are attracting growing interest. We can contribute by forecasting the arrival time and height of a tsunami with computer simulations. However, tsunami simulations always require massive data processing. The shallow water […]
Dec, 9

ASW: Accelerating Smith-Waterman Algorithm on Coupled CPU-GPU Architecture

The Smith-Waterman algorithm (SW) is a popular dynamic programming algorithm widely used in bioinformatics for local biological sequence alignment. Due to the high O(n²) time and space complexity of SW and the growing size of biological data, it is crucial to accelerate SW for high performance. In view of the GPU's high efficiency in scientific computing, many […]
Dec, 9

Optimization of a discontinuous finite element solver with OpenCL and StarPU

schnaps is a finite element solver designed to simulate various physical phenomena. It is designed to run on hybrid computers made of several CPUs and GPUs. To address these hybrid architectures, we rely on the StarPU runtime. StarPU makes it possible to incrementally optimize a sequential algorithm in order to migrate to […]
Dec, 9

A Framework for Fast and Efficient Neural Network Compression

Network compression reduces the computational complexity and memory consumption of deep neural networks by reducing the number of parameters. In SVD-based network compression, the right rank needs to be decided for every layer of the network. In this paper, we propose an efficient method for obtaining the rank configuration of the whole network. Unlike previous […]
Dec, 5

International Conference on Frontiers of Neural Networks (ICFNN), 2019

Researchers, scientists, engineers and industry professionals will join together this year at ICFNN 2019, where the latest research will be unveiled and groundbreaking research projects will be presented. The field of Frontiers of Neural Networks is entering an era of unprecedented change and innovation. ICFNN 2019 presents one of 2019’s premier opportunities to hear from and network with […]
Dec, 5

International Conference on Frontiers of Artificial Intelligence and Machine Learning (FAIML), 2019

Researchers, scientists, engineers and industry professionals will join together this year at FAIML 2019, where the latest research will be unveiled and groundbreaking research projects will be presented. The field of Frontiers of Artificial Intelligence and Machine Learning is entering an era of unprecedented change and innovation. FAIML 2019 presents one of 2019’s premier opportunities to hear from and […]

* * *

HGPU group © 2010-2019 hgpu.org

All rights belong to the respective authors