Nov, 5

Molecular Activity Prediction using Deep Learning Software Library

In order to know how work deep learning method in chemoinformatics and bioinformatics problems, we have attempted to predict the molecular activities using the molecular fingerprints (chemical descriptor vectors) provided by the "Merck molecular activity challenge" competition and an open source deep learning library Chainer. Our result has been able to reproduce almost identical increase-decrease […]
Nov, 5

A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs

Sorting is at the core of many database operations, such as index creation, sort-merge joins and user-requested output sorting. As GPUs are emerging as a promising platform to accelerate various operations, sorting on GPUs becomes a viable endeavour. Over the past few years, several improvements have been proposed for sorting on GPUs, leading to the […]
Nov, 5

grim: A Flexible, Conservative Scheme for Relativistic Fluid Theories

Hot, diffuse, relativistic plasmas such as sub-Eddington black hole accretion flows are expected to be collisionless, yet are commonly modeled as a fluid using ideal general relativistic magnetohydrodynamics (GRMHD). Dissipative effects such as heat conduction and viscosity can be important in a collisionless plasma and will potentially alter the dynamics and radiative properties of the […]
Nov, 3

Extensions and Limitations of the Neural GPU

The Neural GPU is a recent model that can learn algorithms such as multi-digit binary addition and binary multiplication in a way that generalizes to inputs of arbitrary length. We show that there are two simple ways of improving the performance of the Neural GPU: by carefully designing a curriculum, and by increasing model size. […]
Nov, 3

Diplomat: Mapping of multi-kernel applications using a static dataflow abstraction

In this paper we propose a novel approach to heterogeneous embedded systems programmability using a taskgraph based framework called Diplomat. Diplomat is a taskgraph framework that exploits the potential of static dataflow modeling and analysis to deliver performance estimation and CPU/GPU mapping. An application has to be specified once, and then the framework can automatically […]
Nov, 3

MILC staggered conjugate gradient performance on Intel KNL

We review our work done to optimize the staggered conjugate gradient (CG) algorithm in the MILC code for use with the Intel Knights Landing (KNL) architecture. KNL is the second generation Intel Xeon Phi processor. It is capable of massive thread parallelism, data parallelism, and high on-board memory bandwidth and is being adopted in supercomputing […]
Nov, 3

A hybrid algorithm for parallel molecular dynamics simulations

This article describes an algorithm for hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-ranged forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds […]
Nov, 3

Hybrid CPU-GPU generation of the Hamiltonian and Overlap matrices in FLAPW methods

In this paper we focus on the integration of high-performance numerical libraries in ab initio codes and the portability of performance and scalability. The target of our work is FLEUR, a software for electronic structure calculations developed in the Forschungszentrum J"ulich over the course of two decades. The presented work follows up on a previous […]
Nov, 2

A Survey of Techniques for Architecting TLBs

Translation lookaside buffer (TLB) caches virtual to physical address translation information and is used in systems ranging from embedded devices to high-end servers. Since TLB is accessed very frequently and a TLB miss is extremely costly, prudent management of TLB is important for improving performance and energy efficiency of processors. In this paper, we present […]
Nov, 1

Classification Performance of Convolutional Neural Networks

The purpose of this thesis is to determine the performance of convolutional neural networks in classifications per millisecond, not training or accuracy, for the GTX960 and the TegraX1. This is done through varying parameters of the convolutional neural networks and using the Python framework Theano’s function profiler to measure the time taken for different networks. […]
Nov, 1

Programming Heterogeneous Systems from an Image Processing DSL

Specialized image processing accelerators are necessary to deliver the performance and energy efficiency required by important applications in computer vision, computational photography, and augmented reality. But creating, "programming,"and integrating this hardware into a hardware/software system is difficult. We address this problem by extending the image processing language, Halide, so users can specify which portions of […]
Nov, 1

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we […]
Page 11 of 904« First...910111213...203040...Last »

* * *

* * *

TwitterAPIExchange Object
    [oauth_access_token:TwitterAPIExchange:private] => 301967669-yDz6MrfyJFFsH1DVvrw5Xb9phx2d0DSOFuLehBGh
    [oauth_access_token_secret:TwitterAPIExchange:private] => o29ji3VLVmB6jASMqY8G7QZDCrdFmoTvCDNNUlb7s
    [consumer_key:TwitterAPIExchange:private] => TdQb63pho0ak9VevwMWpEgXAE
    [consumer_secret:TwitterAPIExchange:private] => Uq4rWz7nUnH1y6ab6uQ9xMk0KLcDrmckneEMdlq6G5E0jlQCFx
    [postfields:TwitterAPIExchange:private] => 
    [getfield:TwitterAPIExchange:private] => ?cursor=-1&screen_name=hgpu&skip_status=true&include_user_entities=false
    [oauth:protected] => Array
            [oauth_consumer_key] => TdQb63pho0ak9VevwMWpEgXAE
            [oauth_nonce] => 1484688974
            [oauth_signature_method] => HMAC-SHA1
            [oauth_token] => 301967669-yDz6MrfyJFFsH1DVvrw5Xb9phx2d0DSOFuLehBGh
            [oauth_timestamp] => 1484688974
            [oauth_version] => 1.0
            [cursor] => -1
            [screen_name] => hgpu
            [skip_status] => true
            [include_user_entities] => false
            [oauth_signature] => O/deVswsn97aITD1z3/PRVvwL5Y=

    [url] => https://api.twitter.com/1.1/users/show.json
Follow us on Facebook
Follow us on Twitter

HGPU group

2129 peoples are following HGPU @twitter

HGPU group © 2010-2017 hgpu.org

All rights belong to the respective authors

Contact us: