
Posts

Oct, 12

Overtaking CPU DBMSes with a GPU in Whole-Query Analytic Processing with Parallelism-Friendly Execution Plan Optimization

Existing work on accelerating analytic DB query processing with (discrete) GPUs fails to fully realize their potential for speedup through parallelism: Published results do not achieve significant speedup over more performant CPU-only DBMSes when processing complete queries. This paper presents a successful effort to better meet this challenge, in the form of a proof-of-concept query […]
Oct, 12

Understanding Latency Hiding on GPUs

Modern commodity processors such as GPUs may execute up to about a thousand physical threads per chip to better utilize their numerous execution units and hide execution latencies. Understanding this novel capability, however, is hindered by the overall complexity of the hardware and the complexity of typical workloads. In this dissertation, we suggest a better […]
Oct, 12

Neural Network Computing Using On-Chip Accelerators

The use of neural networks, machine learning, or artificial intelligence, in its broadest and most controversial sense, has been a tumultuous journey involving three distinct hype cycles and a history dating back to the 1960s. Resurgent, enthusiastic interest in machine learning and its applications bolsters the case for machine learning as a fundamental computational kernel. […]
Oct, 12

Portage: Bringing Hackers’ Wisdom to Science

Providing users of HPC systems with a wide variety of up-to-date software packages is a challenging task. Large software stacks built from source are difficult to manage, requiring powerful package management tools. The Portage package manager from Gentoo is a highly flexible tool that offers a mature solution to this otherwise daunting task. […]
Oct, 12

SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs

Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images. Applications require LDA to handle both large datasets and a large number of topics. Though distributed CPU systems have been used, GPU-based systems have emerged as a promising alternative because of the high computational power and memory […]
Oct, 9

International Conference on Digital Signal Processing (ICDSP), 2017

For papers submitted to ICDSP 2017, we offer the following publication options: 1. Publication in proceedings, which will be indexed by EI Compendex, Scopus, and ISI CPCS. 2. Publication in the International Journal of Signal Processing Systems, which will be indexed by EI (INSPEC, IET), Google Scholar, etc. There are two methods for submitting […]
Oct, 9

6th International Conference on Frontiers of Information Technology (ICFIT), 2017

For papers submitted to ICFIT 2017, we offer the following publication options: 1. Publication in Proceedings. Submissions will be peer reviewed by conference committees, and accepted papers will be published in proceedings, which will be indexed by EI Compendex, Scopus, and ISI CPCS. 2. Publication in Journal. Submissions will be reviewed by the conference committees […]
Oct, 9

6th International Conference on Software and Computing Technologies (ICSCT), 2017

For papers submitted to ICSCT 2017, we offer the following publication options: 1. Publication in Proceedings. Submissions will be peer reviewed by conference committees, and accepted papers will be published in proceedings, which will be indexed by EI Compendex, Scopus, and ISI CPCS. 2. Publication in Journal. Submissions will be reviewed by the conference committees […]
Oct, 9

2nd IEEE International Conference on Signal and Image Processing (ICSIP), 2017

1. Publication: After a careful review process, all accepted papers, following proper registration and presentation, will be published in the conference proceedings by IEEE and sent for review by the IEEE Conference Publication Program for IEEE Xplore and Ei Compendex. 2. Submission Methods: Electronic Submission System (.pdf) https://www.easychair.org/conferences/?conf=icsip2017
Oct, 8

Implementation of Frequency Domain Convolution for the Caffe-Framework

Deep Convolutional Neural Networks have received a lot of attention over the past few years as a promising technique for object classification in images. In this thesis, we implemented the frequency domain convolution for the popular Caffe framework. Deep Convolutional Neural Networks suffer from long training times even on contemporary hardware, which we want to […]
Oct, 8

GPU Concurrency Choices in Graph Analytics

Graph analytics is becoming ever more ubiquitous in today’s world. However, situational dynamic changes in input graphs, such as changes in traffic and weather patterns, lead to variations in concurrency. Moreover, graph algorithms are known to have data-dependent loops and fine-grain synchronization that make them hard to scale on parallel machines. Recent trends in […]
Oct, 8

BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images

In cryo-electron microscopy (EM), molecular structures are determined from large numbers of projection images of individual particles. To harness the full power of this single-molecule information, we use the Bayesian inference of EM (BioEM) formalism. By ranking structural models using posterior probabilities calculated for individual images, BioEM in principle addresses the challenge of working with […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hgpu.org