
Posts

Mar, 25

Accelerating Deep Neural Network Training with Inconsistent Stochastic Gradient Descent

SGD is the widely adopted method for training CNNs. Conceptually, it approximates the population with a randomly sampled batch; it then trains batches evenly by conducting a gradient update on every batch in an epoch. In this paper, we demonstrate that Sampling Bias, Intrinsic Image Difference and Fixed Cycle Pseudo Random Sampling differentiate batches in training, […]
Mar, 25

An Efficient Implementation of the Longest Common Subsequence Algorithm with Bit-Parallelism on GPUs

The longest common subsequence (LCS) for two given strings has various applications, such as for the comparison of deoxyribonucleic acid (DNA). In this thesis, we propose a graphics processing unit (GPU) algorithm to accelerate Hirschberg’s LCS algorithm improved with the bit-parallel algorithm by Crochemore et al. The algorithm by Crochemore et al. includes bitwise logical […]
Mar, 25

A mixed precision semi-Lagrangian algorithm and its performance on accelerators

In this paper we propose a mixed precision algorithm in the context of the semi-Lagrangian discontinuous Galerkin method. The performance of this approach is evaluated on a traditional dual socket workstation as well as on a Xeon Phi and an NVIDIA K80. We find that the mixed precision algorithm can be implemented efficiently on these […]
Mar, 25

A Survey of Recent Prefetching Techniques for Processor Caches

As the trends of process scaling make the memory system an even more crucial bottleneck, the importance of latency hiding techniques such as prefetching grows further. However, naively using prefetching can harm performance and energy efficiency, and hence several factors and parameters need to be taken into account to fully realize its potential. In this paper, we […]
Mar, 22

The First International Workshop on GPU Computing and Applications (GCA), 2016

Built for massive parallelism, General Purpose computing on Graphics Processing Units (GPGPU) has superseded high-performance CPUs in a number of important tasks, including computer graphics, physics calculations, encryption/decryption and scientific computations. The goal of this workshop is to provide a forum to discuss and evaluate emerging techniques, platforms and applications capable of harvesting the power […]
Mar, 22

Comparison of Technologies for General-Purpose Computing on Graphics Processing Units

The computational capacity of graphics cards for general-purpose computing has progressed fast over the last decade. A major reason is computationally heavy computer games, where standards of performance and high-quality graphics constantly rise. Another reason is better-suited technologies for programming the graphics cards. Combined, the product is high raw performance devices and means […]
Mar, 22

Proteus: Efficient Resource Use in Heterogeneous Architectures

Current processors provide a variety of different processing units to improve performance and power efficiency. For example, ARM’s big.LITTLE, AMD’s APUs, and Oracle’s M7 provide heterogeneous processors, on-die GPUs, and on-die accelerators. However, the performance experienced by programs on these accelerators can be highly variable due to issues like contention from multiprogramming or thermal constraints. […]
Mar, 22

Recurrent neural networks for language modeling

The goal of the thesis is to explore the mechanisms and tools that enable efficient development of Recurrent Neural Networks, how to train them, and what they can accomplish with regard to character-level language modelling. Specifically, Gated Recurrent Units and Long Short-Term Memory are the focal point of the training and language modelling. […]
Mar, 22

A Survey of Techniques for Architecting and Managing GPU Register File

To support their massively-multithreaded architecture, GPUs use a very large register file (RF), which has a capacity higher than even the L1 and L2 caches. In total contrast, traditional CPUs use a tiny RF and much larger caches to optimize latency. Due to these differences, along with the crucial impact of RF in determining GPU performance, novel and […]
Mar, 20

OpenCL Cryptographic Library

Modern GPUs are devices with very high parallelism at a very low cost. Support for integer and logic instructions enables us to use them for many workloads unrelated to rendering. Cryptographic algorithms like AES or Blowfish can benefit from being executed on the system’s GPU. Such execution off-loads work from the main CPU, freeing it to […]
Mar, 20

Automatic Detection and Denoising of Signals in Large Geophysical Datasets

To fully understand the complex interactions of various phenomena in the natural world, scientific disciplines such as geology and seismology increasingly rely upon analyzing large amounts of observations. However, data collection is growing at a faster rate than traditional approaches can currently analyze. These datasets, supplied by the increasing use of […]
Mar, 20

Acceleration of ensemble machine learning methods using many-core devices

We present a case study into the acceleration of ensemble machine learning methods using many-core devices in collaboration with Toshiba Medical Visualisation Systems Europe (TMVSE). The adoption of GPUs to execute a key algorithm in the classification of medical image data was shown to significantly reduce overall processing time. Using a representative dataset and pre-trained […]

* * *

HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors
