
Posts

Jul, 19

GPU LSM: A Dynamic Dictionary Data Structure for the GPU

We develop and implement a concurrent dictionary data structure based on the Log Structured Merge tree (LSM), suitable for current massively parallel GPU architectures. Our GPU LSM is dynamic (mutable) in that it provides fast updates (insertions and deletions). For example, on an NVIDIA K40c GPU we can get an average update rate of 225 […]
Jul, 19

Batched QR and SVD Algorithms on GPUs with Applications in Hierarchical Matrix Compression

We present high performance implementations of the QR and the singular value decomposition of a batch of small matrices hosted on the GPU with applications in the compression of hierarchical matrices. The one-sided Jacobi algorithm is used for its simplicity and inherent parallelism as a building block for the SVD of low rank blocks using […]
Jul, 19

Fine-Grain Acceleration of Graph Algorithms on a Heterogeneous Chip

With the rise of heterogeneous chips available in the market, where integrated GPU cores and CPU cores reside in the same chip and share a unified memory, it is possible to have better execution schemes for many graph algorithms. Graph algorithms can exhibit producer-consumer behavior, a varying amount of parallelism during execution, and irregularity which […]
Jul, 15

International Conference on Image and Graphics Processing (ICIGP), 2018

Aiming to bring together researchers and practitioners from academia and industry to exchange technical knowledge and boost technical and educational collaboration within the related fields of image and graphics processing, the 2018 International Conference on Image and Graphics Processing (ICIGP 2018) will be held in Hong Kong on February 24-26, 2018. It is sponsored by […]
Jul, 15

8th International Conference on Bioscience, Biochemistry and Bioinformatics (ICBBB), 2018

The ICBBB conference series is held annually to provide an interactive forum for presentation and discussion on Bioscience, Biochemistry and Bioinformatics. The conference welcomes participants from all over the world who are interested in developing professional ties to and/or exploring career opportunities in the region. The conference should serve as an ideal forum to establish relationships from […]
Jul, 15

The 10th International Conference on Computer Modeling and Simulation (ICCMS), 2018

The 10th International Conference on Computer Modeling and Simulation is the main annual research conference that aims to bring together researchers from around the world to exchange research results and address open issues in all aspects of Computer Modeling and Simulation. ICCMS 2009-2017 were held in Macau, Sanya, Mumbai, Hong Kong, Rome, Barcelona, Amsterdam, Brisbane, and […]
Jul, 15

4th International Conference on Virtual Reality (ICVR), 2018

The 2018 4th International Conference on Virtual Reality (ICVR 2018) will be held on February 24-26, 2018 in Hong Kong. ICVR 2018 will bring together top researchers from the Asia-Pacific region, North America, Europe and all around the world to exchange research results and address open issues in all aspects of Virtual Reality. Publication: Accepted papers […]
Jul, 15

10th International Conference on Machine Learning and Computing (ICMLC), 2018

ICMLC 2018, the 10th International Conference on Machine Learning and Computing, aims to bring together leading academic scientists, researchers and research scholars to exchange and share their experiences and research results on all aspects of Machine Learning and Computing. It also provides a premier interdisciplinary platform for researchers, practitioners and educators to present and […]
Jul, 14

Cooperative Kernels: GPU Multitasking for Blocking Algorithms

There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorithms are typically blocking, so they require fair scheduling. But GPU programming models (e.g. OpenCL) do not mandate fair scheduling, and GPU schedulers are unfair in practice. Current approaches avoid this issue by exploiting scheduling quirks of today’s GPUs in a manner that […]
Jul, 14

Multikernel Data Partitioning With Channel on OpenCL-Based FPGAs

Recently, field-programmable gate array (FPGA) vendors (such as Altera) have started to address the programmability issues of FPGAs via OpenCL SDKs. In this paper, we analyze the performance of relational database applications on FPGAs using OpenCL. In particular, we study how to improve the performance of data partitioning, which is a very important building block […]
Jul, 14

Benchmarking Data Analysis and Machine Learning Applications on the Intel KNL Many-Core Processor

Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the […]
Jul, 14

A Similarity Measure for GPU Kernel Subgraph Matching

Accelerator architectures specialize in executing SIMD (single instruction, multiple data) in lockstep. Because the majority of CUDA applications are parallelized loops, control flow information can provide an in-depth characterization of a kernel. CUDAflow is a tool that statically separates CUDA binaries into basic block regions and dynamically measures instruction and basic block frequencies. CUDAflow captures […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
