Posts

Apr, 12

CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models

In recent years, recurrent neural network language models (RNNLMs) have become increasingly popular for a range of applications including speech recognition. However, the training of RNNLMs is computationally expensive, which limits the quantity of data, and size of network, that can be used. In order to fully exploit the power of RNNLMs, efficient training implementations […]
Apr, 12

Efficient Parallel Implementation for Single Block Orthogonal Dictionary Learning

Dictionary training for sparse representations involves large volumes of data and complex algorithms, which result in time-consuming tasks. In this paper we propose an improved parallel version of the single block orthogonal dictionary learning algorithm that reduces the representation error and improves the execution time. Our solution targets OpenCL-capable graphical device units […]
Apr, 12

Portable and Transparent Software Managed Scheduling on Accelerators for Fair Resource Sharing

Accelerators, such as Graphics Processing Units (GPUs), are popular components of modern parallel systems. Their energy-efficient performance makes them attractive components for modern data center nodes. However, they lack control for fair resource sharing amongst multiple users. This paper presents a runtime and Just-In-Time compiler that enables resource sharing control and software managed […]
Apr, 12

Algorithmic and Software System Support to Accelerate Data Processing in CPU-GPU Hybrid Computing Environments

Massively data-parallel processors, Graphics Processing Units (GPUs) in particular, have recently entered the mainstream of general-purpose computing as powerful hardware accelerators for a wide range of applications including databases, medical informatics, and big data analytics. However, despite their performance benefit and cost effectiveness, the utilization of GPUs in production systems remains limited. A […]
Apr, 12

Real-Time Computation of Parameter Fitting and Image Reconstruction Using Graphical Processing Units

In recent years graphical processing units (GPUs) have become a powerful tool in scientific computing. Their potential to speed up highly parallel applications brings the power of high performance computing to a wider range of users. However, programming these devices and integrating their use in existing applications is still a challenging task. In this paper […]
Apr, 9

GIFT: A Real-time and Scalable 3D Shape Search Engine

Projective analysis is an important solution for 3D shape retrieval, since human visual perception of 3D shapes relies on various 2D observations from different viewpoints. Although multiple informative and discriminative views are utilized, most projection-based retrieval systems suffer from heavy computational cost and thus cannot satisfy the basic requirement of scalability for search engines. In […]
Apr, 9

Monte-Carlo Black-Scholes Implementation using OpenCL Standard

OpenCL is a standard parallel language based on the C language. It offers users full performance advantages while also providing the flexibility of a high-level language. In this paper, we explore the use of the OpenCL language to implement complex designs on FPGAs by describing the design in a high-level abstraction language. […]
Apr, 9

Optimizing Performance of Recurrent Neural Networks on GPUs

As recurrent neural networks become larger and deeper, training times for single networks are rising into weeks or even months. As such, there is a significant incentive to improve the performance and scalability of these networks. While GPUs have become the hardware of choice for training and deploying recurrent models, the implementations employed often make […]
Apr, 9

dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures

This paper presents dMath, a new scalable parallel math library that demonstrates leading scaling when using intranode or internode hybrid parallelism for deep learning. dMath provides easy-to-use distributed base primitives and a variety of domain-specific algorithms. These include matrix multiplication, convolutions, and others, allowing for rapid development of highly scalable applications, including Deep Neural Networks […]
Apr, 9

The 8th International Conf. on Signal Processing Systems (ICSPS), 2016

Publication: Accepted papers will be published in the conference proceedings, which will be indexed by EI Compendex, SCOPUS, ULRICH’s Periodicals Directory, INSPEC, etc. Agenda: November 21, 2016 – Registration & Conference Materials Collection; November 22, 2016 – Workshop; November 23, 2016 – Keynote Speeches & Participants’ Oral Presentation; November 24, 2016 – Academic Visiting. The […]
Apr, 9

International Conf. on Biomedical Signal and Bioinformatics (ICBSB), 2016

Schedule: November 21, 2016 (Monday): Participants Onsite Registration & Conference Materials Collection. November 22, 2016 (Tuesday): Opening Ceremony and Keynote Speeches; Participants’ Oral Presentation; Excellent Paper Awards Ceremony & Dinner Banquet. November 23, 2016 (Wednesday): Academic Visit. November 24, 2016 (Thursday): Tutorial Registration; Tutorial. Conference Venue: The Sir Paul Reeves Building at AUT. Address: […]
Apr, 9

5th International Conf. on Bioinformatics and Biomedical Science (ICBBS), 2016

ICBBS 2016 Shining Points: 1. Accepted and published papers can be indexed by Embase (under Elsevier) and other databases. 2. Three outstanding professors from Indonesia, Thailand, and the USA have joined as keynote speakers. They are Prof. Tjokorda Gde Tirta Nindhia from Udayana University, Indonesia, Prof. Orawan Siriratpiriya from the Environmental Research Institute of Chulalongkorn University, Thailand, […]
HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors